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PREFACE 


What this book is about. The theory of sets is a vibrant, exciting mathematical 
theory, with its own basic notions, fundamental results and deep open prob- 
lems, and with significant applications to other mathematical theories. At the 
same time, axiomatic set theory is often viewed as a foundation of mathematics'. 
it is alleged that all mathematical objects are sets, and their properties can be 
derived from the relatively few and elegant axioms about sets. Nothing so 
simple-minded can be quite true, but there is little doubt that in standard, 
current mathematical practice, “making a notion precise” is essentially syn- 
onymous with “defining it in set theory”. Set theory is the official language of 
mathematics, just as mathematics is the official language of science. 

Like most authors of elementary, introductory books about sets, I have 
tried to do justice to both aspects of the subject. 

From straight set theory, these Notes cover the basic facts about “abstract 
sets”, including the Axiom of Choice, transfinite recursion, and cardinal and 
ordinal numbers. Somewhat less common is the inclusion of a chapter on 
“pointsets” which focuses on results of interest to analysts and introduces 
the reader to the Continuum Problem, central to set theory from the very 
beginning. There is also some novelty in the approach to cardinal numbers, 
which are brought in very early (following Cantor, but somewhat deviously) , 
so that the basic formulas of cardinal arithmetic can be taught as quickly as 
possible. Appendix A gives a more detailed “construction” of the real numbers 
than is common nowadays, which in addition claims some novelty of approach 
and detail. Appendix B is a somewhat eccentric, mathematical introduction 
to the study of natural models of various set theoretic principles, including 
Aczel’s Antifoundation. It assumes no knowledge of logic, but should drive 
the serious reader to study it. 

About set theory as a foundation of mathematics, there are two aspects of 
these Notes which are somewhat uncommon. First, I have taken seriously 
this business about “everything being a set” (which of course it is not) and 
have tried to make sense of it in terms of the notion of faithful representation 
of mathematical objects by structured sets. An old idea, but perhaps this 
is the first textbook which takes it seriously, tries to explain it, and applies 
it consistently. Those who favor category theory will recognize some of its 
basic notions in places, shamelessly folded into a traditional set theoretical 
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approach to the foundations where categories are never mentioned. Second, 
computation theory is viewed as part of the mathematics “to be founded" 
and the relevant set theoretic results have been included, along with several 
examples. The ambition was to explain what every young mathematician or 
theoretical computer scientist needs to know about sets. 

The book includes several historical remarks and quotations which in some 
places give it an undeserved scholarly gloss. All the quotations (and most 
of the comments) are from papers reprinted in the following two. marvellous 
and easily accessible source books, which should be perused by all students 
of set theory: 

Georg Cantor, Contributions to the founding of the theory of transfinite 
numbers, translated and with an Introduction by Philip E. B. Jourdain. Dover 
Publications. New York. 

Jean van Heijenoort, From Frege to Godel, Harvard University Press, Cam- 
bridge. 1967. 

How to use it. About half of this book can be covered in a Quarter (ten 
weeks), somewhat more in a longer Semester. Chapters 1-6 cover the 
beginnings of the subject and they are written in a leisurely manner, so that 
the serious student can read through them alone, with little help. The trick 
to using the Notes successfully in a class is to cover these beginnings very 
quickly: skip the introductory Chapter 1, which mostly sets notation; spend 
about a week on Chapter 2, which explains Cantor’s basic ideas; and then 
proceed with all deliberate speed through Chapters 3 - 6, so that the theory 
of well ordered sets in Chapter 7 can be reached no later than the sixth week, 
preferably the fifth. Beginning with Chapter 7. the results are harder and the 
presentation is more compact. How much of the “real” set theory in Chapters 
7-12 can be covered depends, of course, on the students, the length of the 
course, and what is passed over. If the class is populated by future computer 
scientists, for example, then Chapter 6 on Fixed Points should be covered in 
full, with its problems, but Chapter 10 on Baire Space might be omitted, sad 
as that sounds. For budding young analysts, at the other extreme, Chapter 
6 can be cut off after 6.27 (and this too is sad), but at least part of Chapter 
10 should be attempted. Additional material which can be left out. if time is 
short, includes the detailed development of addition and multiplication on the 
natural numbers in Chapter 5, and some of the less central applications of the 
Axiom of Choice in Chapter 9. The Appendices are quite unlikely to be taught 
in a course (I devote just one lecture to explain the idea of the construction 
of the reals in Appendix A) , though I would like to think that they might be 
suitable for undergraduate Honors Seminars, or individual reading courses. 

Since elementary courses in set theory are not offered regularly and they 
are seldom long enough to cover all the basics, I have tried to make these 
N otes accessible to the serious student who is studying the subject on their 
own. There are numerous, simple Exercises strewn throughout the text, which 
test understanding of new notions immediately after they are introduced. In 
class I present about half of them, as examples, and I assign some of the rest 



Preface 


ix 


for easy homework. The Problems at the end of each chapter vary widely in 
difficulty, some of them covering additional material. The hardest problems 
are marked with an asterisk (*). 

Acknowledgments. I am grateful to the Mathematics Department of the 
University of Athens for the opportunity to teach there in Fall 1990, when I 
wrote the first draft of these N otes, and especially to Prof. A. Tsarpalias who 
usually teaches that Set Theory course and used a second draft in Fall 1991; 
and to Dimitra Kitsiou and Stratos Paschos for struggling with PCs and laser 
printers at the Athens Polytechnic in 1990 to produce the first “hard copy” 
version. I am grateful to my friends and colleagues at UCLA and Caltech 
(hotbeds of activity in set theory) from whom I have absorbed what 1 know of 
the subject, over many years of interaction. I am especially grateful to my wife 
Joan Moschovakis and my student Darren Kessner for reading large parts of 
the preliminary edition, doing the problems and discovering a host of errors; 
and to Larry Moss who taught out of the preliminary edition in the Spring 
Term of 1993, found the remaining host of errors and wrote out solutions to 
many of the problems. 

The book was written more-or-less simultaneously in Greek and English, by 
the magic of bilingual DTpXand in true reflection of my life. I have dedicated it 
to Prof. Nikos Kritikos (a student of Caratheodory), in fond memory of many 
unforgettable hours he spent with me back in 1973, patiently teaching me how 
to speak and write mathematics in my native tongue, but also much about the 
love of science and the nature of scholarship. In this connection, I am also 
greatly indebted to Takis Koufopoulos, who read critically the preliminary 
Greek version, corrected a host of errors and made numerous suggestions 
which (I believe) improved substantially the language of the final Greek draft. 


Palaion Phaliron, Greece November 1993 

About the 2nd edition. Perhaps the most important changes I have made 
are in small things, which (I hope) will make it easier to teach and learn from 
this book: simplifying proofs, streamlining notation and terminology, adding 
a few diagrams, rephrasing results (especially those justifying definition by 
recursion) to ease their applications, and, most significantly, correcting errors, 
typographical and other. For spotting these errors and making numerous, 
useful suggestions over the years, I am grateful to Serge Bozon, Joel Hamkins, 
Peter Hinman, Aki Kanamori, Joan Moschovakis, Larry Moss, Thanassis 
Tsarpalias and many, many students. 

The more substantial changes include; 

— A proof of Suslin’s Theorem in Chapter 10, which has also been signifi- 
cantly massaged. 

— A better exposition of ordinal theory in Chapter 12 and the addition of 
some material, including the basic facts about ordinal arithmetic. 
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— The last chapter, a compilation of solutions to the Exercises in the 
main part of the book - in response to popular demand. This eliminates the 
most obvious, easy homework assignments, and so I have added some easy 
problems. 

I am grateful to Thanos Tsouanas, who copy-edited the manuscript and 
caught the worst of my mistakes. 

Palaion Phaliron. Greece July 2005 
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CHAPTER 1 


INTRODUCTION 


Mathematicians have always used sets, e.g., the ancient Greek geometers 
defined a circle as the set of points at a fixed distance r from a fixed point C, 
its center. But the systematic study of sets began only at the end of the 19th 
century with the work of the great German mathematician Georg Cantor, 
who created a rigorous theory of the concept of completed infinite by which 
we can compare infinite sets as to size. For example, let 

N = {0, 1, . . . } = the set of natural numbers, 

Z={... .—1,0, 1,...} = the set of rational integers. 

Q = the set of rational numbers (fractions), 

R = the points of a straight line, 

where we also identify R with the set of real numbers, each point associated 
with its (positive or negative) coordinate with respect to a fixed origin and 
direction. Cantor asked if these four sets “have the same (infinite) number 
of elements”, or if one of them is “more numerous” than the others. Before 
we make precise and answer this question in the next chapter, we review here 
some basic, well-known facts about sets and functions, primarily to explain 
the notation we will be using. 

What are sets, anyway? The question is like “what are points”, which Euclid 
answered with 

a point is that which has no parts. 

This is not a rigorous mathematical definition, a reduction of the concept of 
“point” to other concepts which we already understand, but just an intuitive 
description which suggests that a point is some thing which has no extension 
in space. Like that of point, the concept of set is fundamental and cannot be 
reduced to other, simpler concepts. Cantor described it as follows: 

By a set we are to understand any collection into a whole of definite 
and separate objects of our intuition or our thought. 

Vague as it is, this description implies two basic properties of sets. 

1 . Every set A has elements or members. We write 

x € A <=> the object x is a member of (or belongs to) A. 


l 
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2. A set is determined by its members, i.e., if A, B are sets, then 1 

A = B <==>■ A and B have the same members (1-1) 

(Vx)[x G A x € B], 

This last is the Extensionality Property. For example, the set of students in 
this class will not change if we all switch places, lie down or move to another 
classroom; this set is completely determined by who we are, not our posture 
or the places where we happen to be. 

Somewhat peculiar is the empty set 0 which has no members. The exten- 
sionality property implies that there is only one empty set. 

If A and B are sets, we write 

A C B (Vx)[x € A =£> x £ B ], 

and if A C B . we call A a subset of B . so that for every B. 

0CB, B C B. 

A proper subset of B is a subset distinct from B, 

AC B 4=> [ACB&A^B], 

From the extensionality property it follows that for all sets A. B, 

A = B A C B&B C A. 


We have already used several different notations to define specific sets and 
we need still more, e.g., 

A = {a\,a 2 , .... a„} 

is the (finite) set with members the objects a\, ai, . . . , a„. If P is a condition 
which specifies some property of objects, then 

A = {x | P(x)} 

is the set of all objects which satisfy the condition P, so that for all x, 

x £ A P(x). 


For example, if 

P(x) x e N& x is even, 

then{x | P(x)j is the set of all even, natural numbers. We use a variant of this 
notation when we are only interested in “collecting into a whole” members of 
a given set A which satisfy a certain condition; 

{x G A | P{x)} =df {x | x e A8iP(x)}, 


1 We will use systematically, as abbreviations, the logical symbols 

& : and, V : or, -> : not, => : implies, •$=*• : if and only if, 

V : for all. 3 : (here exists, 3! : there exists exactly one. 

The symbols =df and are rea d “equal by definition” and “equivalent by definition”. 
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Figure 1.1. The Boolean operations. 

so that, for example, {x G N | x > 0} is the set of all non-zero natural 
numbers, while {x G R. | x > 0} is the set of all positive real numbers. 

For any two sets 2 A, B, 

A U B = {x | x G A V x € B} (the union of A, B ), 

A n B = {x G A | x G B} (the intersection of A, B). 

A \ B — {x G A | x B} (the difference of A, B). 

These “Boolean operations” are illustrated in the so-called Venn diagrams of 
Figure 1.1, in which sets are represented by regions in the plane. The union 
and the intersection of infinite sequences of sets are defined in the same way, 

= A 0 U A t U • • • = {x | (3« G N)[x G A„]}, 

n„°V» = A 0 n Ai n • • • = {x | (V/7 G N)[x G A „\ }. 

Two sets are disjoint if their intersection is empty, 

A is disjoint from B •<=>■ A n B = 0. 

We will use the notations 

/ : X -> Y or A -4 B 

to indicate that / is a function which associates with each member x of the 
set X , the domain of / some member /(x) of the range Y of /. Functions 
are also called mappings, operations, transformations and many other things. 
Sometimes it is convenient to use the abbreviated notation (x i— > / (x)) which 
makes it possible to talk about a function without officially naming it. For 
example. 

(x i— > x 2 + 1) 

is the function on the real numbers which assigns to each real its square 
increased by 1; if we call it /. then it is defined by the formula 

/ (x) = x 2 + 1 (x G M) 


2 In “mathematical English”, when we say “for any two objects x, y”, we do not mean that 
necessarily x ^ y , e.g.. the assertion that “for any two numbers x, y , (x + y ) 2 = x 2 + 2xy + y 2 ” 
implies that “for every number x, (x + x) 2 = x 2 + 2xx + x 2 ”. 
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so that /(0) = 1, /(2) = 5, etc. But we can say “all the values of (x i— > x 2 + 1) 
are positive reals” without necessarily fixing a name for it. like /. 

Two functions are equal if they have the same domain and they assign the 
same value to every member of their common domain. 

/ = g ^ (Vx G X)[f(x) = g(x)] (/ : X -> Y, g : X -+ Z, x G X). 
In connection with functions we will also use the notations 
/ : X >-* Y / is an injection (one-to-one) 

4=> (Vx, x' G X)[f{x) = /(x , )=>x = x'\, 

f : X — » Y -^==>df / is a surjection (onto) 

(VjG Y){3x&X)[f{x) = yl 


f : X >— » Y G=G*df / is a bijection or a correspondence 

(VjG Y){3\x&X)[f{x) = y}. 

For every / : X — ► Y and A C X , the set 

f[A] =df {fix) | x eAj 
is the image of A under /, and if B C Y, then 

f~ l [B] =df {x G X \ f(x) £ B} 
is the pre-image of B by /. 

If / is a bijection. then we can define the inverse function / -1 : Y — » X by 
the condition 

f~\y) = x f( x )=y, 

and then the inverse image f~ x [B] (as we defined it above) is precisely the 
image of B under / -1 . 

The composition 

h = df gf X —> Z 

of two functions 

X -4 Y z 

is defined by 

h(x)=g(f(x)) (.tel). 

It is easy to prove many basic properties of sets and functions using only 
these definitions and the extensionality property. For example, 

AU B = B U A, 


because, for any x, 


x G A U B 


x € A or x G B 
x G B or x G A 
x G B U A. 


In some cases, the logic of the argument gets a bit complex and it is easier 
to prove an identity U = V by verifying separately the two implications 
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x G U => x £ V and x G V => x G U. For example, if / : X — > Y and 
A, B C X, then 

f[AUB] = f[A\Uf[B]. 

To prove this, we show first that 

xef[AUB]=>xef[A\Uf[B]-, 

this holds because if x G f[A U B\. then there is some y G A U B such that 
x = f{y)\ and if y G A, then x = f(y) G f[A\ C f[A\ U f[B], while if 
y G B, then x = f{y) G f[B] C f\A\ U f[B], Next we show the converse 
implication, that 

xG/[d]U/[5]^xG/[dU5]; 

this holds because if x G f[A\, then x = f(y) for some y G A C A U B, and 

so x G f[A U B], while if x G f[B], then x = / (j) for some y G B C A U B, 

and so, again, x G f[A U B], 

Problems for Chapter 1 

xl.l. For any three sets A, B, C, 

A U {B n C) = (A U B) n (A u c), 

A n (B u c) = (a n b) u (A n c), 

A \ (A n B) = A \ B. 

xl.2. (De Morgan’s laws) For any three sets A, B, C, 

C\(dUB) = (C\d)n(C\B), 

C\(AnB) = (C\A)u(C\B). 

xl.3. (De Morgan’s laws for sequences) For any set C and any sequence of 
sets { A„}„ = A 0 ,A \, . . . , 

C\(U„^)=D„(C\^), 

c\(n„^) = u„(c\^). 


xl.4. For every injection / : X >— » F, and all A, B C X, 
f[AnB] = f[A\nf[B], 
f[A\B] = f[A]\f[B]. 

Show also that these identities do not always hold if / is not an injection. 
xl.5. For every / : X — > Y, and all A, B C Y , 

f- 1 [AUB] = f- l [A]l)f~ 1 [Bl 
f- 1 [AnB] = f- l [A]nf~ l [B], 
f~ l [A\B] = f-\A\\f~\B]. 
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xl.6. For every f : X —> Y and all sequences of sets A„ C X, B„ C F, 

/- I [U^] = U^o/- 1 [^»]. 

/- 1 [nr=o^]=nr=o/- 1 [^]- 

/[ur=o^]=ur= 0 /M»]- 

xl.7. For every injection f : X >—> Y and every sequence of sets A„ C X, 

/[nr=o^] = nr= 0 /M«]- 

xl.8. The composition of injections is an injection, the composition of sur- 
jections is a surjection, and hence the composition of bijections is a bijection. 
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After these preliminaries, we can formulate the fundamental definitions of 
Cantor about the size or cardinality of sets. 

2.1. Definition. Two sets A, B are equinumerous or equal in cardinality if there 
exists a (one-to-one) correspondence between their elements, in symbols 

A= C B <=*# (3 /)[/ : A ^ B], 

This definition of equinumerosity stems from our intuitions about finite 
sets, e.g., we can be sure that a shoe store offers for sale the same number 
of left and right shoes without knowing exactly what that number is: the 
correspondence of each left shoe with the right shoe in the same pair estab- 
lishes the equinumerosity of these two sets. The radical element in Cantor’s 
definition is the proposal to accept the existence of such a correspondence as 
the characteristic property of equinumerosity for all sets, despite the fact that 
its application to infinite sets leads to conclusions which had been viewed as 
counterintuitive. A finite set. for example, cannot be equinumerous with one 
of its proper subsets, while the set of natural numbers N is equinumerous with 
N \ {0} via the correspondence (x i-> x + 1), 

{0,1, 2, ...}= c {1,2,3,...}. 

In the real numbers, also, 

(0.1) = c (0.2) 

via the correspondence (x i— > 2x), where as usual, for any two reals a < [l 
{a, ft) = {x € R. | a < x < /?}. 

We will use the analogous notation for the closed and half-closed intervals 
[aj],[aj),e tc. 

2.2. Proposition. For all sets A, B,C, 

A = c A, 

if A = c B. then B = c A, 
if ( A = c B &B = c C), then A = c C. 

Proof. To show the third implication as an example, suppose that the 
bijections / : A >— » B and g : B >-» C witness the equinumerosities of the 
hypothesis; their composition gf : A >-» C then witnesses that A = c C. 3 
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n : 3 1 0 / 5 / 9 7 

111 I II 

/( 0) /(l) /( 2) / (3) /(4) / (5) 

Figure 2.1. Deleting repetitions. 

2.3. Definition. The set A is less than or equal to B in size if it is equinumerous 
with some subset of B, in symbols: 

A < c B (3 C)[C C B & A = c C], 

2.4. Proposition. A < c B (3 /)[/ : A >— > B ]. 

Proof. If 4 = c C C B and f : A C witnesses this equinumerosity, 
then / is an injection from A into B. Conversely, if there exists an injection 
/ : A >— > B, then the same / is a bijection of A with its image f[A\, so that 
A =c f[A\ G B and so A B by the definition. I 

2.5. Exercise. For all sets A. B.C, 


A < t A, 

if (A < c B & B < c C), then A < c C. 


2.6. Definition. A set A is finite if there exists some natural number n such 
that 

A = c {i G N | i < n} = {0, 1, . . . , n — 1}, 

otherwise A is infinite. (Thus the empty set is finite, since 0 = {/ e N | i < 0}.) 

A set A is countable if it is finite or equinumerous with the set of natu- 
ral numbers N. otherwise it is uncountable. Countable sets are also called 
denumerable, and correspondingly, uncountable sets are non-denumerable. 

2.7. Proposition. The following are equivalent for every set A: 

(1) A is countable. 

(2) A < c N. 

(3) Either A = 0, or A has an enumeration, a surjection n : N — » A, so that 

A = ;c[N] = {7t(0),w(l),w(2),... }. 

Proof. We give what is known as a “round robin proof”. 

(1) => (2). If A is countable, then either A = c { ; 6 N | i < n} for some n 
or A = c N, so that, in either case, A = c C for some CCN and hence A < c N. 

(2) => (3). Suppose A f (/), choose some xo G A, and assume by (2) that 
/ : A >-> N. For each i e N, let 

n < i \ = { x 0’ if iif\A\, 

\ f~ l (i), otherwise, i.e., if i G f\A\. 
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The definition works (because / is an injection, and so /“'(/) is uniquely 
determined in the second case) , and it defines a surjection n : N — » A , because 
xo G A and for every x € A, x = (x)). 

(3) => (1). If A is finite then (1) is automatically true, so assume that 
A is infinite but it has an enumeration n : N — » A. We must find another 
enumeration / : N — » A which is without repetitions, so that it is in fact a 
bijection of N with A. and hence A = c N. The proof is suggested by Figure 
2.1: we simply delete the repetitions from the given enumeration n of A. To 
get a precise definition of / by recursion, notice that because A is not finite, 

for every finite sequence cio a n of members of A there exists some m such 

that 7 x{m) {ciq, . . . , a n }. Set 

/( 0) = 7r(0), 

m n = the least m such that n{m) ^ {/( 0) / («)}, 

/(« + 1) = n{m„). 

It is obvious that / is an injection, so it is enough to verify that every x G A 
is a value of /, i.e., that for every n G N, n{n) G /[N], This is immediate for 
0, since 7i(0) = /( 0). If x = n{n + 1) for some n and x G {/(0), . . . ,/(«)}, 
then x = / (i) for some i < n\ and if x ^ {/ (0), . . . , / («)}, then m„ = 77 + 1 
and f(n + 1) = n(m n ) = x by the definition. H 

2.8. Exercise. If A is countable and there exists an injection f : B >— > A, then 
B is also countable ; in particular, every subset of a countable set is countable. 

2.9. Exercise. If A is countable and there exists a surjection f : A —» B, then 
B is also countable. 

The next, simple theorem is one of the most basic results of set theory. 

2.10. Theorem (Cantor). For each sequence Aq,A\, ... of countable sets, the 
union 

A = \JZ 0 An = A 0 UA l U... 

is also a countable set. 

In particular, the union All B of two countable sets is countable. 

Proof. The second claim follows by applying the first to the sequence 

A. B. B, - ■ ■ 

For the first, it is enough (why?) to consider the special case where none 
of the A„ is empty, in which case we can find for each A„ an enumeration 
n n : N — » A n . If we let 

a." = 7 z"(i) 

to simplify the notation, then for each n 

A„ = {a’faf...}. 

and we can construct from these enumerations a table of elements which lists 
all the members of the union A. This is pictured in Figure 2.2, and the arrows 
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A 0 : 

Ai : 

A-2 : 

Figure 2. 

in that picture show how to enumerate the union: 

A = {oq, , Oq, a\ }. H 

2.11. Corollary. The set of rational ( positive and negative) integers 

Z = {... — 2, —1, 0, 1, 2, ... } 

is countable. 

Proof. Z = NU {— 1 , -2, . . . , } and the set of negative integers is countable 
via the correspondence (j h -(j + 1)). H 

2.12. Corollary. The set <Q> of rational numbers is countable. 

Proof. The set Q + of non-negative rationals is countable because 

Q + = Ur=i{^l™eN} 

and each | m e N} is countable via the enumeration (m ^). The set 
Q“ of negative rationals is countable by the same method, and then the union 
Q + U Q - is countable. H 

This corollary was Cantor’s first significant result in the program of classifi- 
cation of infinite sets by their size, and it was considered somewhat “paradoxi- 
cal” because Q appears to be so much larger than N. Immediately afterwards, 
Cantor showed the existence of uncountable sets. 

2.13. Theorem (Cantor). The set of infinite , binary sequences 

A = {(ao, a\,. .. , ) | (V/)[a,- = 0 V a,- = 1]} 

is uncountable. 

Proof. Suppose (towards a contradiction) that A is countable, so there 
exists an enumeration 

A = {ao, a\ }, 

where for each n. 

a n = (a'fiaf...) 



.2. Cantor’s first diagonal method. 
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«o : 


a\ : 


0-2 '■ 



Figure 2.3. Cantor’s second diagonal method. 

is a sequence of 0’s and l’s. 3 We construct a table with these sequences as 
before, and then we define the sequence p by interchanging 0 and 1 in the 
“diagonal” sequence a[j, a},... : 

Pin) = 1 - a*. 

It is obvious that for each a n , P f a n , since 

Pin) = 1 - a„(n) f a„{n), 

so that the sequence a 0 , a\, . . . does not enumerate the entire A. contrary to 
our hypothesis. H 

2.14. Corollary (Cantor). The set R. of real numbers is uncountable. 

Proof. We define first a sequence of sets Co, C\, . . . , of real numbers which 
satisfy the following conditions: 

1. Co = [0,1]. 

2. Each C n is a union of 2" closed intervals and 

Co 2 Cj D ■ ■ ■ C n D C n+ 1 D • • • . 

3. C „ + 1 is constructed by removing the (open) middle third of each interval 
in C n . i.e., by replacing each [a, b ] in C„ by the two closed intervals 

L[a. b ] = [a. a + j(fe — a)\, 

2 , 

R[a. b ] = [a + —(b — a), b}. 

With each binary sequence <5 G A we associate now a sequence of closed 
intervals, 

j?S t?5 


’To prove a proposition 9 by the method of reduction to a contradiction, we assume its negation 
-■I? and derive from that assumption something which violates known facts, a contradiction, 
something absurd: we conclude that 9 cannot be false, so it must be true. Typically we will begin 
such arguments with the code-phrase towards a contradiction, which alerts the reader that the 
supposition which follows is the negation of what we intend to prove. 
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Co 
Ci 
C 2 
C 3 

Figure 2.4. The first four stages of the Cantor set construction. 

by the following recursion: 

*o = Co = [0,1], 
s J LF%, if<5(«) = 0, 

" +1 \RFf 1 ,iiS{n) = l. 

By induction, for each n, F% is one of the closed intervals of C„ of length 3“" 
and obviously 

Fo^FfD---, 

so by the fundamental completeness property of the real numbers the intersec- 
tion of this sequence is not empty; in fact, it contains exactly one real number, 
call it 

f{S) = the unique element in the intersection p| ^L 0 F^. 

The function / maps the uncountable set A into the set 

c = n“ 0 c,„ 

the so-called Cantor set, so to complete the proof it is enough to verify that 
/ is one-to-one. But if n is the least number for which 3(rt) / e(n) and (for 
example) 3{n) = 0, we have F% = F~ from the choice of n. 

f{S) G F d n+l = LFl /(e) G Fj +1 = RF s n , and LF* n RF* = 0. 

so that indeed / is an injection. H 

The basic mathematical ingredient of this proof is the appeal to the com- 
pleteness property of the real numbers, which we will study carefully in Ap- 
pendix A. Some use of a special property of the reals is necessary; the rest 
of Cantor’s construction relies solely on arithmetical properties of numbers 
which are also true of the rationals, so if we could avoid using completeness 
we would also prove that Q is uncountable, contradicting Corollary 2.12. 

The fundamental importance of this theorem was instantly apparent, the 
more so because Cantor used it immediately in a significant application to the 
theory of algebraic numbers. Before we prove this corollary we need some 
definitions and lemmas. 

2.15. Definition. For any two sets A. B, the set of ordered pairs of members 
of A and members of B is denoted by 

Ax B = {(x, y) | -v G A&y G Bj. 



Chapter 2. Equinumerosity 


13 


In the same way, for each n > 2, 

A\ x • • • x A n = {(xi, . . . ,x n ) | at G A\, . . . , x n G A n }, 

A n = {(at, . . . , x n ) | ai, . . . , x n G A}. 

We call A i x • • • x A„ the Cartesian product of A\, . . . . A n . 

2.16. Lemma. (1) If A \, ... ,A n are all countable, so is their Cartesian product 
A\ x • • • x A n . 

(2) For every countable set A, each A" ( n > 2) and the union 
U 7=2 A " = {(*i . • • • . x n ) \n>2, Al — , x n G A} 
are all countable. 

Proof. ( 1 ) If some A , is empty, then the product is empty (by the definition) 
and hence countable. Otherwise, in the case of two sets A. B, we have some 
enumeration 

B = {b 0 ,b u ...} 

of B. obviously 

Ax B = [JZoiA x {*„}), 

and each A x {b n } is equinumerous with A (and hence countable) via the 
correspondence (a i— > (a ,b„)). This gives the result for n = 2. To prove the 
proposition for all n > 2, notice that 

Ai x ■ ■ ■ x A n x A n+ 1 = c (Ai x ■ ■ ■ x A n ) x A n+1 

via the bijection 

/(at,... , ^n, @n+i) = ((at. • • • > a w ), a n +i). 

Thus, if every product of n > 2 countable factors is countable, so is every 
product of n + 1 countable factors, and so (1) follows by induction. 

(2) Each A" is countable by (1), and then UZiA" is also countable by 
another appeal to Theorem 2.10. H 

2.17. Definition. A real number a is algebraic if it is a root of some polynomial 

P( a) = «o + «ix + • • • + a n x n 

with integer coefficients a 0 ,a n G Z (; n > 1 ,a n f 0), i.e., if 

P(a) = 0. 

Typical examples of algebraic numbers are \fl, (1 + \fl) 2 (why?) but also 
the real root of the equation a 5 + a + 1 =0 which exists (why?) but cannot be 
expressed in terms of radicals, by a classical theorem of Abel. The basic fact 
(from algebra) about algebraic numbers is that a polynomial of degree n > 1 
has at most n real roots', this is all we need for the next result. 

2.18. Corollary. The set K of algebraic real numbers is countable (Cantor) , and 
hence there exist real numbers which are not algebraic (Liouville). 
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Proof. The set n of all polynomials with integer coefficients is countable, 
because each such polynomial is determined by the sequence of its coefficients, 
so that II can be injected into the countable set (J ™ =2 Z " . For each polynomial 
P(x), the set of its roots 

A (P(x)) = {a | P(a) = 0} 

is finite and hence countable. It follows that the set of algebraic numbers K is 
the union of a sequence of countable sets and hence it is countable. H 

This first application of the (then) new theory of sets was instrumental 
in ensuring its quick and favorable acceptance by the mathematicians of the 
period, particularly since the earlier proof of Liouville (that there exist non- 
algebraic numbers) was quite intricate. Cantor showed something stronger, 
that “almost all” real numbers are not algebraic, and he did it with a much 
simpler proof which used just the fact that a polynomial of degree n cannot 
have more than n real roots, the completeness of R, and, of course, the new 
method of counting the members of infinite sets. 

So far we have shown the existence of only two “orders of infinity”, that of 
N — the countable, infinite sets — and that of R. There are many others. 

2.19. Definition. The powerset V{A) of a set A is the set of all its subsets, 

V(A) = {X | A is a set and X C A}. 

2.20. Exercise. For all sets A. B. 

A= c B^V(A)= c V(B). 

2.21. Theorem (Cantor). For every set A, 

A < c P(A), 

i.e.. A < c V(A) but A f c V(A)\ in fact there is no surjection n : A — » V(A). 
Proof. That A < c V(A ) follows from the fact that the function 

(x i ^ {x}) 

which associates with each member x of A its singleton {x} is an injection. 
(Careful here: the singleton {x} is a set with just the one member x and it is 
not the same object as x, which is probably not a set to begin with!) 

To complete the proof, we assume (towards a contradiction) that there 
exists a surjection 

n : A -» P(A), 

and we define the set 

B = {x € A \ x £ 7r(x)}. 

so that for every x € A, 

x € B x £ 7r(x) . (2-1) 

Now B is a subset of A and n is a surjection, so there must exist some b e A 
such that B = n(b); and setting x = b and n{b) = B in (2-1), we get 

b €B <==> b£B 
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which is absurd. H 

So there are many orders of infinity, and specifically (at least) those of the 
sets 


N < c V(N) < c V(T( N)) < e ■ ■ ■ . 


If we name these sets by the recursion 


T 0 = N, 

T, 2+i = V{T n ), 


( 2 - 2 ) 


then their union = (J ^L 0 T n has a larger cardinality than each T n , Problem 
x2.8. The classification and study of these orders of infinity is one of the central 
problems of set theory. 

Somewhat more general than powersets are function spaces. 

2.22. Definition. For any two sets A. B. 

(A — > B) = df {f\f:A—> B} 

= the set of all functions from A to B. 


2.23. Exercise. If A\ = c A 2 and B\ = c B 2 , then (A i — > B\) = c (A 2 —> Bf). 

Function spaces are “generalizations” of powersets because each subset 
X C A can be represented by its characteristic function cx '■ A — > {0, 1}, 


exit) 


l, if teAnx, , , 

0 , \U e A\x. 


(2-3) 


We can recover X from c x , 

X = {teA | exit) = 1}, 

and so the mapping (X i— > c x ) is a correspondence of V(A) with (A — > {0. 1}). 
Thus 

(A -> {0, 1}) = c V(A) > c A. (2-4) 

and the function space operation also leads to large, uncountable sets. The 
next obvious problem is to compare for size these uncountable sets, starting 
with the two simplest ones, P(N) and the set R of real numbers. 

2.24. Lemma. "P(N) < c R. 

Proof. It is enough to prove that 'P(N) < c A, since we have already shown 
that A < c R. This follows immediately from (2-4), as A = (N — > {0, 1}). H 
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A B 




Figure 2.5. Proof of the Schroder-Bernstein Theorem. 

2.25. Lemma. R < c T’(N). 

Proof. It is enough to show that R < c V{Q), since the set of rationals Q is 
equinumerous with N and hence "P(N) = c V{Q). This follows from the fact 
that the function 


x i-> n(x) = {q G Q | q < x} C Q 

is an injection, because if x < y are distinct real numbers, then there exists 
some rational q between them, x < q < y and q € n(y) \ n{x). H 

With these two simple Lemmas, the equinumerosity R = c V(N) will follow 
immediately from the following basic theorem. 

2.26. Theorem (Schroder-Bernstein). For any two sets A, B, 
if A < c B and B < c A, then A = c B. 

Proof. 4 We assume that there exist injections 

/ : A >-► B, g : B >-► A, 

and we define the sets A„, B„ by the following recursive definitions: 

Aq = A, Bo = B, 

A n +i = gf[A„\, B n+ 1 = fg[B n \. 


4 A different proof of this theorem is outlined in Problems x4.26, x4.27. 
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where fg[X] = {/(g(x)) | x G X} and correspondingly for the function gf. 
By induction on n (easily) 

A n G) g[B H ] ' A n + l, 

B n 2 f[A n ] D B„+i, 

so that we have the “chains of inclusions” 

d 0 2g[5o]3^ 2g[Bi]2A 2 --- , 

Bo 2 /Mo] 7 5i 2 f[A\\ 2 B 2 - - . 

We also define the intersections 

a* = n„°v«> 5* = nzA,- 

so that 

b * = nr=o^ 2 n„°°=o/M«] 2 nr= 0 ^ + i = * * 

and since / is an injection, by Problem xl.7, 

f[A*] = f[f)Zo A n] = n“o f\M = B*. 

Thus / is a bijection of A* with B*. On the other hand, 

A=A*U(A 0 \ g[B 0 ]) U (g[£ 0 ] \ A x ) U Mi \ g[ih]) U {g[Bi] \ A 2 ) . . . 

B = B* U (Bo \ /Mo]) U (/Mo] \ B\) U {B\ \ /Mi]) U (/Mi] \ B 2 ) . . . 

and these sequences are separated, i.e., no set in them has any common element 
with any other. To finish the proof it is enough to check that for every n, 

f[A n \g[B n ]]=f[A n ]\B n+u 
g[B„ \ f[A n ]\ = g[B „ ] \ A n+U 

from which the first (for example) is true because / is an injection and so 
f[A n \ g[B„]] = f[A,\ \ f g[B n \ = f[A,\ \ B n+ ] . 

Finally we have the bijection n : A >— » B, 

( \ = / fix), if x G A* or (3 n)[x G A n \ g[B„]], 

W 1 g~ l (x), if a- i A* and (3 n)[x G g[B„] \ A n+l ], 

which verifies that A = c B and finishes the proof. 3 

Using the Schrdder-Bernstein Theorem we can establish easily several equinu- 
merosities which are quite difficult to prove directly. 

Problems for Chapter 2 

x2.1. For any a < / where a , / are reals, oo or — oo, construct bijections 
which prove the equinumerosities 

(a,0) = c (0.1) = c R. 

*x2.2. For any two real numbers a < fi. construct a bijection which proves the 
equinumerosity 

[aJ)= c [aJ]= c R. 
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x2.3. "P(N) = c R = c R". for every n > 2. 

x2.4. For any two sets A. B, (A —> B) < c V(A x B). Hint. Represent each 
/ : A — > B by its graph, the set 

Gf = {(x,y) &AxB\y = 
x2.5. (N -» N) = c P(N). 

*x2.6. (N -► K) = c K. 

*x2.7. For any three sets A. B. C , 

((AxB)->C)= c (A->(B->C)). 

Hint. For any p : A x B —> C, define n{p) = q : A — > {B — > C) by the 
formula 

q{x){y) = p(x, y). 

x2.8. Using the definition (2-2). for every m, 

T m < c = LCo 7 ’"- 

You need to know something about continuous functions to do the last two 
problems. 

*x2.9. The set C[0, 1] of all continuous, real functions on the closed interval 
[0, 1] is equinumerous with R. 

*x2.10. The set of all monotone real functions on the closed interval [0. 1] is 
equinumerous with R. 



CHAPTER 3 


PARADOXES AND AXIOMS 


In the preceding chapter we gave a brief exposition of the first, basic results 
of set theory, as it was created by Cantor and the pioneers who followed him 
in the last twenty five years of the 19th century. By the beginning of the 20th 
century, the theory had matured and justified itself with diverse and significant 
applications, especially in mathematical analysis. Perhaps its greatest success 
was the creation of an exceptionally beautiful and useful transfinite arithmetic , 
which introduces and studies the operations of addition, multiplication and 
exponentiation on infinite numbers. By 1900, there were still two fundamental 
problems about equinumerosity which remained unsolved. These have played 
a decisive role in the subsequent development of set theory and we will consider 
them carefully in the following chapters. Here we just state them, in the form 
of hypotheses. 

3.1. Cardinal Comparability Hypothesis. 5 For any two sets A, B , either A < c B 
or B < c A. 

3.2. Continuum Hypothesis. There is no set of real numbers X with cardinality 
intermediate between those of N andM., i.e., 

(CH) (VICK)[I< c NVl= f R], 

Since R = c V{N), CH is a special case of the Generalized Continuum Hypoth- 
esis, the statement that for every infinite set A, 

(GCH) (VX C V{A))[X < c A V X = c V{A)}. 

If both of these hypotheses are true, then the natural numbers N and the reals 
R represent the two smallest “orders of infinity”: every set is either countable, 
or equinumerous with R, or strictly greater than R in cardinality. 

In this beginning “naive” phase, set theory was developed on the basis of 
Cantor’s definition of sets quoted in Chapter 1, much as we proved its basic 
results in Chapter 2. If we analyze carefully the proofs of those results, we 


5 Cantor announced the “theorem of comparability of cardinals" in 1895 and in 1899 he 
outlined a proposed proof of it in a letter to Dedekind, which was not, however, published until 
1932. There were problems with that argument and it is probably closer to the truth to say that 
until 1900 (at least) the question of comparability of cardinals was still open. 
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will see that they are all based on the Extensionality Property (1-1) and the 
following simple assumption. 

3.3. General Comprehension Principle. For each «-ary definite condition P, 
there is a set 

A = {x | P{x)} 

whose members are precisely all the «-tuples of objects which satisfy P{x), so 
that for all x, 

x € A P(x). (3-1) 

The extensionality principle implies that at most one set A can satisfy (3-1), 
and we call this A the extension of the condition P. 

3.4. Definite conditions and operations. It is necessary to restrict the compre- 
hension principle to definite conditions to avoid questions of vagueness which 
have nothing to do with science. We do not want to admit the “set” 

A =df {x | x is an honest politician}, 

because membership of some specific public figure in it may be a hotly debated 

topic. An n-ary condition P is definite if for each n-tuple x = {x\ x n ) of 

objects, it is determined unambiguously whether P(x) is true or false. For 
example, the binary conditions P and S defined by 

P(x,y) x is a parent of y, 

S{s,t) <^=>df s and t are siblings 

•<=>• (3x)[/*(x, s)&P(x, t)] 

are both definite, assuming (for the example) that the laws of biology de- 
termine parenthood unambiguously. The General Comprehension Principle 
applies to them and we can form the sets of pairs 

A =df {(x,y) | x is a parent of y}, 

B =df {(j, t) | s and t are siblings}. 

We do not demand of a definite condition that its truth value be effectively 
determined. For example, it is a famous open problem of number theory 
whether there exist infinitely many pairs of successive, odd primes, and the 
truth or falsity of the condition 

G(n) **=>df n e N& {3m > n)[m, m + 2 are both prime numbers] 

is not known for sufficiently large n. Still the condition G is unambiguous 
and we can use it to form the set of numbers 

C =df {« € N | {3m > n)[m, m + 2 are both prime numbers]}. 

The twin primes conjecture asserts that C = N, but if it is false, then C is 
some large, initial segment of the natural numbers. 
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In the same way, an n-ary operation F is definite if it assigns to each n-tuple of 
objects x a unique, unambiguously determined object w = F(x). For example, 
assuming again that biology will not betray us, the operation 

w x f the father of x, if x is a human, 

F[X) =df \ x, otherwise, 

is definite. The silly consideration of cases here was put in to ensure that F 
determines a value for each argument x. In practice, we would define this 
operation by the simpler 

Fix) = df the father of x, 

leaving it to the reader to supply some conventional, irrelevant value F{x) for 
non-human x’s. Again, definite operations need not be effectively computable , 
in fact the determination of the value F{x) is sometimes the subject of judicial 
conflict in this specific case. 

In addition to the General Comprehension Principle, we also assumed in 
the preceding chapter the existence of some specific sets, including the sets 
N and R of natural and real numbers, as well as the definiteness of some 
basic conditions from classical mathematics, e.g., the condition of “being a 
function”, 

Function(/, A, B) / is a function from A to B. 

This poses no problem as mathematicians have always made these assump- 
tions, explicitly or implicitly. 

The General Comprehension Principle has such strong intuitive appeal that 
the next theorem is called a “paradox”. 

3.5. Russell’s paradox. The General Comprehension Principle is not valid. 

Proof. Notice first that if the General Comprehension Principle holds, 
then the set of all sets 

V =df {x | x is a set} 

exists, and it has the peculiar property that it belongs to itself, V £ V. The 
common sets of everyday mathematics — sets of numbers, functions, etc. — 
surely do not belong to themselves, and so it is natural to consider them as 
members of a smaller, more natural universe of sets, by applying the General 
Comprehension Principle again, 

R = {.x | x is a set and x ^ x}. 

From the definition of R, however, 

R £ R <= => R <f R, 

which is absurd. H 

When it is more than just a mistake, a “paradox” is simply a fact which 
runs counter to our intuitions, and set theorists already knew several such 
“paradoxes” before Russell announced this one in 1902, in a historic letter to 
the leading German philosopher and founder of mathematical logic Gottlob 
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Frege. These other paradoxes, however, were technical and affected only 
some of the most advanced parts of Cantor’s theory. One could imagine 
that higher set theory had a systematic error built in, something like allowing 
a careless “division by 0” which would soon be discovered and disallowed, 
and then everything would be fixed. After all. contradictions and paradoxes 
had plagued the “infinitesimal calculus” of Newton and Leibnitz and they 
all went away after the rigorous foundation of the theory which was just 
being completed in the 1890s, without affecting the vital parts of the subject. 
Russell’s paradox, however, was something else again: simple and brief, it 
affected directly the fundamental notion of set and the “obvious” principle of 
comprehension on which set theory had been built. It is not an exaggeration 
to say that Russell’s paradox brought a foundational crisis of doubt, first to 
set theory and through it. later, to all of mathematics, which took over thirty 
years to overcome. 

Some, like the French geometer Poincare and the Dutch topologist and 
philosopher Brouwer, proposed radical solutions which essentially dismissed 
set theory (and much of classical mathematics along with it) as “pseudothe- 
ories”, without objective content. From those who were reluctant to leave 
“Cantor’s paradise”, Russell first attempted to “rescue” set theory with his 
famous theory of types, which, however, is awkward to apply and was not 
accepted by a majority of mathematicians. 6 At approximately the same time, 
Zermelo proposed an alternative solution, which in time and with the contri- 
butions of many evolved into the contemporary theory of sets. 

In his first publication on the subject in 1908. Zermelo took a pragmatic 
view of the problem. No doubt the General Comprehension Principle was 
not generally valid, Russell’s paradox had made that clear. On the other hand, 
the specific applications of this principle in the proofs of basic facts about sets 
(like those in Chapter 2) are few, simple and seemingly non-contradictory. 

Under such circumstances there is at this point nothing left for us to 
do but to proceed in the opposite direction [from that of the General 
Comprehension Principle] and. starting from set theory as it is his- 
torically given, to seek out the principles required for establishing the 
foundations of this mathematical discipline. In solving the problem 
we must, on the one hand, restrict these principles sufficiently to ex- 
clude all contradictions and. on the other, take them sufficiently wide 
to retain all that is valuable in this theory. 

In other words, Zermelo proposed to replace the direct intuitions of Cantor 
about sets which led us to the faulty General Comprehension Principle with 
some axioms, hypotheses about sets which we accept with little a priori justi- 
fication, simply because they are necessary for the proofs of the fundamental 
results of the existing theory and seemingly free of contradiction. 


6 The theory of types has had a strong influence in the development of analytic philosophy and 
logic, and some of its basic ideas eventually have also found their place in set theory. 
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Such were the philosophically dubious beginnings of axiomatic set theory, 
surely one of the most significant achievements of 20th century science. From 
its inception, however, the new theory had a substantial advantage in the ge- 
nius of Zermelo, who selected an extraordinarily natural and pliable axiomatic 
system. None of Zermelo’s axioms has yet been discarded or seriously revised 
and (until very recently) only one basic new axiom was added to his seven 
in the decade 1920-1930. In addition, despite the opportunistic tone of the 
cited quotation, each of Zermelo’s axioms expresses a property of sets which is 
intuitively obvious and was already well understood from its uses in classical 
mathematics. With the experience gained from working out the consequences 
of these axioms over the years, a new intuitive notion of “grounded set” has 
been created which does not lead to contradictions and for which the axioms 
of set theory are clearly true. We will reconsider the problem of foundation 
of set theory after we gain experience by the study of its basic mathematical 
results. 

The basic model for the axiomatization of set theory was Euclidean geom- 
etry. which for 2000 years had been considered the “perfect” example of a 
rigorous, mathematical theory. If nothing else, the axiomatic method clears 
the waters and makes it possible to separate what might be confusing and 
self-contradictory in our intuitions about the objects we are studying, from 
simple errors in logic we might be making in our proofs. As we proceed in our 
study of axiomatic set theory, it will be useful to remind ourselves occasionally 
of the example of Euclidean geometry. 

3.6. The axiomatic setup. We assume at the outset that there is a domain or 
universe W of objects, some of which are sets, and certain definite conditions 

and operations on W, among them the basic conditions of identity , sethood 
and membership : 

x = y •£=> x is the same object as y, 

Set(x) •£=>■ x is a set. 

x € y -£=>■ Set(y) and x is a member of y. 

We call the objects in W which are not sets atoms, but we do not require that 
any atoms exist, i.e., we allow the possibility that all objects are sets. Definite 
conditions and operations are neither sets nor atoms. 

This is the way every axiomatic theory begins. In Euclidean geometry 
for example, we start with the assumption that there are points , lines and 
several other geometrical objects and that some basic, definite conditions and 
operations are specified on them, e.g., it makes sense to ask if a “point P lies 
on the line L”, or “to construct a line joining two given points”. We then 
proceed to formulate the classical axioms of Euclid about these objects and to 
derive theorems from them. Actually Euclidean geometry is quite complex: 
there are several types of basic objects and a long list of intricate axioms about 
them. By contrast. Zermelo’s set theory is quite austere: we just have sets and 
atoms and only seven fairly simple axioms relating them. In the remainder of 
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this chapter we will introduce six of these axioms with a few comments and 
examples. It is a bit easier to put off stating his last, seventh axiom until we 
first gain some understanding of the consequences of the first six in the next 
few chapters. 

3.7. (I) Axiom of Extensionality. For any two sets A, B, 

A = B (Vx)[x G A -<==>■ x G B], 

3.8. (II) Emptyset and Pairset Axiom, (a) There is a special object 0 which we 
will call a set, but which has no members. ( b ) For any two objects x, y, there is 
a set A whose only members are x and y, so that it satisfies the equivalence 

t G A <==> t = x V t = y. (3-2) 

The Axiom of Extensionality implies that only one empty set exists, and that 
for any two objects x, y, only one set A can satisfy (3-2). We denote this 

doubleton of x and y by 

{x, y} =df the unique set A with sole members x, y. 

If x = y, then {x, x} = {x} is the singleton of the object x. 

Using this axiom we can construct many simple sets, e.g., 

0. {0}. {{0}}. {0, {0} } , {{0}.{{0}» 

but each of them has at most two members! 

3.9. Exercise. Prove that 0 f {0}. 

3.10. (Ill) Separation Axiom or Axiom of Subsets. For each set A and each 
unary, definite condition P, there exists a set B which satisfies the equivalence 

xGB •£=>■ x € A&P(x). (3-3) 

From the Extensionality Axiom again, it follows that only one B can satisfy 
(3-3) and we will denote it by 

B = {x G A | P(x)}. 

A characteristic contribution of Zermelo, this axiom is obviously a re- 
striction of the General Comprehension Principle which implies many of its 
trouble-free consequences. For example, we can use it to define the operations 
of intersection and difference on sets, 

An B =df {x € A | x € Bj, 

A\B =df {x G A | x ^ B}. 

The proof of Russell’s paradox yields a theorem: 

3.11. Theorem. For each set A, the set 

r(A ) = d f {x G A | x ^ x} (3-4) 

is not a member of A. It follows that the collection of cdl sets is not a set, i.e., 
there is no set V such that 


x G V 


Set(x). 
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Proof. Notice first that r(A) is a set by the Separation Axiom. Assuming 
that r(A) G A, we have (as before) the equivalence 

r(A) £ r(A) r(A ) i r(A), 

which is absurd. 3 

3 . 12 . (IV) Powerset Axiom. For each set A , there exists a set B whose members 
are the subsets of A, i.e., 

X G B <=> Set(A)&A C A. (3-5) 

Here X C A is an abbreviation of (Vf)[t G X => t G A], The Axiom of 
Extensionality implies that for each A, only one set B can satisfy (3-5); we 
call it the powerset of A and we denote it by 

V(A) =df {X | Set(A) & A C A}. 

3 . 13 . Exercise. Vfb) = {0} andV{{®}) = {0. {0}}. 

3 . 14 . Exercise. For each set A, there exists a set B whose members are exactly 
all singletons of members of A, i.e., 

x G B •<=> (3 1 G A)[x = {/}]. 

3 . 15 . (V) Unionset Axiom. For each set If, there exists a set B whose members 
are the members of the members of<o, i.e., so that it satisfies the equivalence 

t£B (3IG %)[t G X], (3-6) 

The Axiom of Extensionality implies again that for each If, only one set can 
satisfy (3-6); we call it the unionset of If and we denote it by 

U^=df{t I (3A £%)[t G X]}. 

The unionset operation is obviously most useful when If is a family of sets, 
i.e., a set all of whose members are also sets. This is the case for the simplest 
application, which (finally) gives us the binary, union operation on sets: we 
set 

AU B = \J{A.B} 

using axioms (II) and (V), and we compute 

t G A U B ^ (3 X £ {A. B})[t £ X] 
t G AV t G B. 

3 . 16 . Exercise. |J 0 = 1J { 0 } = 0 - 

3 . 17 . (VI) Axiom of Infinity. There exists a set I which contains the empty set 
0 and the singleton of each of its members, i.e., 

0 G / & (Vx)[.v G / => {-v} G /]. 

We have not given yet a rigorous definition of “infinite”, but it is quite 
obvious that any / with the properties in the axiom must be infinite, since 
(VI) implies 


0 G /. {0} G I, {{0}} G /. . . . 
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and the objects 0, {0}, ... are all distinct sets by the Extensionality Axiom. 
The intuitive understanding of the axiom is that it demands precisely the 
existence of the set 

/ = { 0 ,{ 0 },{{ 0 }},...}. 

but it is simpler (and sufficient) to assume of / only the stated properties, 
which imply that it contains all these complex singletons. 

It was a commonplace belief among philosophers and mathematicians of 
the 19th century that the existence of infinite sets could be proved, and in 
particular the set of natural numbers could be “constructed” out of thin air, 
“by logic alone”. All the proposed “proofs” involved the faulty General Com- 
prehension Principle in some form or another. We know better now: logic 
can codify the valid forms of reasoning but it cannot prove the existence of 
anything, let alone infinite sets. By taking account of this fact cleanly and 
explicitly in the formulation of his axioms, Zermelo made a substantial contri- 
bution to the process of purging logic of ontological concerns and separating 
the mathematical development of the theory of sets from logic, to the benefit 
of both disciplines. 

3.18. Axioms for definite conditions and operations. Zermelo understood def- 
inite conditions intuitively, he described them much as we did in 3.4 and he 
applied the Separation Axiom using various quite complex conditions without 
any special argument that they are, indeed, “definite”. We will do the same, 
because the business of proving definiteness is boring and not particularly 
illuminating. For the sake of completeness, however, we list here the only 
properties of definiteness that we will actually use. 

1 . The following basic conditions are definite: 

x = y ^=4>df v and y are the same object, 

Set(x) -<=>df v is a set. 
x e y -£=>df Set(.v) and x is a member of y, 

2. For each object c and each n, the constant «-ary operation 

F(x i, . . . , x„) = c 

is definite. 

3. Each projection operation 

Fj(x\, , . . , x„) = Xi (1 < i < n) 

is definite. 

4. If P is a definite condition of n + 1 arguments and for each tuple of 

objects x = x\ ,x n there exists exactly one w such that P(x,w) is 

true, then the operation 

F(x) = the unique w such that P(x, w) 


is definite. 
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5. If Q is an m - ary definite condition, each P, is an /7-ary definite operation 
for / = 1 , . . . , m and 

P(x) 4=>df Q(Fi(x), ... , F m (x)), 

then the condition P is also definite. 

6. If Q. R and S are definite conditions of the appropriate number of 
arguments, then so are the following conditions which are obtained 
from them by applying the elementary operations of logic: 

Pi (a) -'P(a) •<=>• P(x) is false, 

Pi{x) 4=»df Q(x)&R(x) <=> both Q(x) and R(x) are true. 

P3 (x) <=>df Q(x ) V R(x) <==$■ either Q(x) or R{x) is true, 

P 4 {x) 4=»df (3y)S(x,y) for some y, S(x, y) is true, 

P 5 (x) <t=>df (Vy)S(x,y) ■<=>• for every y, S (x,y) is true. 

All the conditions and operations we will use can be proved definite by ap- 
pealing to these basic properties. Aside from one problem at the end of this 
chapter, however, for the logically minded, we will omit these technical proofs 
of definiteness and it is best for the reader to forget about them too: they 
detract from the business at hand, which is the study of sets, not definite 
conditions and operations. 

3.19. Classes. Having gone to all the trouble to discredit the General Principle 
of Comprehension, we will now profess that for every unary, definite condition 
P there exists a class 

A = {x | P(x)}, (3-7) 

such that for every object x, 

x € A P(x). (3-8) 

To give meaning to this principle and prove it, we need to define classes. 
Every set will be a class, but because of the Russell Paradox 3.5, there must be 
classes which are not sets, else (3-8) leads immediately to the Russell Paradox 
in the case P(x) <=> Set(x) & x ^ x. 

First let us agree that for every unary, definite condition P we will write 
synonymously 

x £ P ■<=>■ P(x). 

For example, if Set is the basic condition of sethood, we write interchangeably 

x G Set <£=> Set(x) •<=>■ x is a set. 

This is just a convenient notational convention. 

A unary definite condition P is coextensive with a set A if the objects which 
satisfy it are precisely the members of A, in symbols 

P = e A 4=>df (Vx)[P(x) x G A], (3-9) 

For example, if 

P(x) X X, 
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then P = e 0. By the Russell Paradox 3.5, not every P is coextensive with a set. 
On the other hand, a unary, definite condition P is coextensive with at most one 
set ; because if P = e A and also P = e B, then for every xj 

x € A •£=> P(x) x G B, 

and A = B by the Axiom of Extensionality. 

By definition, a class is either a set or a unary definite condition which is not 
coextensive with a set. With each unary condition P. we associate the class 

{ the unique set A such that P = e A, 

if P = e A for some set A, (3-10) 

P. otherwise. 

Now if A =df {x | P(x)}, then either P is coextensive with a set, in which 
case P = e A and by the definition x G A •<=>• P(x); or P is not coextensive 
with any set, in which case A = P and 

x£A •<==>• x G P P(x) (by the notational convention) . 

This is exactly the General Comprehension Principle for Classes enunciated 
in (3-7) and (3-8). 

3.20. Exercise. For every set A, 

{x | x G A} = A. 

and , in particular, every set is a class. Show also that 

{X | Set(W) &X C A} = V(A). 

3.21. Exercise. The class W of all objects is not a set, and neither is the class 
Set of all sets. 

If P is an /;-ary definite condition and F an /7-ary definite operation, we set 
{E(x) | P(x)} = df {w | (3x)[P(x)& W = E(x)]}. (3-11) 

For example, with F(x) = {x}, 

{{x} | x = x} = {«; | (3x)[u/ = {x}]} = the class of all singletons. 

3.22. Exercise. The class {{x} | x = x} of all singletons is not a set. 

3.23. Exercise. For every class A , 

A is a set for some class B, A G B 
for some set X, A C X, 
where inclusion among classes is defined as if they were sets, 

A C B <t=4>df (Vx)[x G A => x G B], 

3.24. The Axioms of Choice and Replacement: a warning. Our axiomatization 
of set theory will not be complete until we introduce Zermelo’s last Axiom 
of Choice in Chapter 8 and the later Axiom of Replacement in Chapter 11. 
While there are good reasons for these postponements which we will explain 
in due course, there are also good reasons for adding the axioms of Choice 
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and Replacement: many basic set theoretic arguments need them, and among 
these are some of the simplest claims of Chapter 2. Thus, until Chapter 8, 
we will need to be extra careful and make sure that our constructions indeed 
can be justified by axioms (I) - (VI) and that we have not sneaked in some 
“obvious” assertion about sets not yet proved or assumed. In a few places we 
will formulate and prove something weaker than the whole truth, when the 
proof of the whole truth needs one of the missing axioms. Now this is good: 
it will keep us on our toes and make us understand better the art of reasoning 
from axioms. 

3.25. About atoms. Most recent developments of axiomatic set theory assume 
at the outset the so-called Principle of Purity, that there are no atoms : all 
objects of the basic domain are sets. There is a certain appealing simplicity to 
this conception of a mathematical world in which everything is a set. We have 
followed Zermelo in allowing atoms (without demanding them), primarily 
because this makes the theory more naturally relevant to the natural sciences: 
we want our results to apply to sets of planets, molecules or frogs, and frogs 
are not sets. In any case, it comes at little cost, we simply have to say “object” 
in some situations where the atom banners would say “set”. It is important to 
notice, however, that none of the axioms requires the existence of atoms, so 
none of the consequences we will derive from them depends on the existence 
of atoms: everything we will prove remains true in the domain of pure sets, 
provided only that it satisfies the Zermelo axioms, as we stated them. 

3.26. Axioms as closure properties of the universe W. Whatever the domain 
W of our axiomatic set theory may be. it is clear that it does not contain all 
“objects of our intuition or thought” in Cantor’s expression: W is not a set. 
and it is certainly a perfectly legitimate mathematical object of our intuition 
about which we intend to have many thoughts. Granting that VV is not all 
there is, we can fruitfully conceive of the axioms as imposing closure conditions 
on it. We have assumed (so far) that >V contains 0, that it is closed under 
the operations of pairing {x, y} (II). powerset V{X) (IV), and unionset |Jlf 
(V), that it includes every definite subcollection of every set (III), and that it 
contains some set / with the stipulated property of the Axiom of Infinity ( VI) . 
It is also possible to understand the Axiom of Extensionality (I) as a closure 
property of W: in its non-trivial direction, it says that for any two sets A, B. 

AfB^{3t)[t&U\B)VJ{B\A)}, (3-12) 

i.e., every inequality A f B between two sets is witnessed by some legitimate 
object GW which belongs to one and not to the other. 

This understanding of the meaning of the axioms is compatible with two 
different conceptions of the universe VV. One is that it is huge, amorphous, 
difficult to understand and impossible to define; but every object in it is 
concrete, definite, whole, and this is enough to justify the closure properties 
of W embodied by the axioms. Let us call this the large view. The small view 
is that W consists precisely of those objects whose existence is “guaranteed” 
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by the axioms, those which can be “constructed” by applying the axioms 
repeatedly: the axioms are satisfied because we deliberately put in W all the 
objects required by the closure properties they express. Neither conception 
is precise, to be sure, but they are different. On the small view, for example, 
there are no atoms since none of the axioms demands their existence, while 
the large view allows frogs among the objects in W. 

Both of these views can be defended and they have played significant roles 
in the philosophy of set theory, and even in its mathematical practice, by 
suggesting the kind of questions one should ask. We will come back to 
discuss the issue in Chapter 11 and Appendix B, when we will be in a position 
to be less flippant about it. In the meantime, we will often speak of the axioms 
as closure conditions on W, a useful heuristic device which is compatible with 
every philosophical approach to the subject. 


Problems for Chapter 3 


x3.1. For each non-empty set isP and each A G ef, we define the intersection 
of<S via X by 

fj A .g- = df { x e x \ (vc g ir)[x g [/]}. 

Show that for any two members X, Y of If, 

n** = riy*. 

i.e., the intersection fj is independent of the specific X we used in its 
definition, and hence we can use for it the notation fjs? which does not 
exhibit X. Show also that AC\ B = [){A, B). 

x3.2. For any two sets A, B, determine whether each of the following classes 
is or is not a set. 

1. {{0, x} j x G A}. 

2. {x | Set(x)&x f 0}. 

3. {{x, y} | x € A&y € Bj. 

4. {V{X) | AC A}. 


x3.3. Prove rigorously that the following conditions and operations are defi- 
nite. using only (I) - (VI) and the axioms in 3.18. (Flere c is some arbitrary 
object.) 


Pi(x) C=4>df x G c, 
Pi{x,y, z) 2 G x, 

Pl(X, Y ) ^ d f XCY 

p 4 (x, y) -t=> d f mr = 0, 

P S {X, Y) <t=*df V(X) C Y. 


F\{x,y) =df {x,y}, 
Fi{X) =df 

FAX) =df Fix), 

F 4 (x) =df {x}, 
Fs(X, Y) =df A U Y 


In the remaining problems of this section we consider Zermelo’s notion of 
equivalence , which intuitively holds between two sets when they are equinu- 
merous. These problems will be trivial after we introduce functions within 



Chapter 3. Paradoxes and axioms 


31 



Figure 3.1. The hypothesis for Part 3 of Problem x3.5. 

Zermelo’s axiomatic theory in the next Chapter, but right now they are chal- 
lenging. 

3.27. Definition. Recall that two sets A, B are disjoint if A n B = 0. A set W 
is a connection of the two disjoint sets A and B (according to Zermelo) if the 
following three conditions hold: 

1. Z G W=>{ 3x € A, y £ B)[Z = {x,y}]. 

2. For each x £ A, there is exactly one y £ B such that {x, y} £ W . 

3. For each y £ B, there is exactly one x £ A such that {x, y} £ W . 

x3.4. For any two disjoint sets A, B, the class 

E(A, B) = { W | IF is a connection of A with B } 

is a set. 

3.28. Definition. Two sets A. B are equivalent according to Zermelo if there 
exists a third set C disjoint from both of them and connections of A with C 
and of B with C, in symbols 

A ~ z B (3C, W, W')[A nC = 0 & B n C = 0 

& W £ S(A. C) & W’ £ JL(B. C)]. 

*x3.5. The condition of equivalence according to Zermelo has the following 
properties, for any three sets A, B, C: 

A~ Z A, 

if A ~z B. then B A, 
if (A ~ z B&B C), then A ~ z C. 

Hint. To prove A A , you need to find some set C such that A n C = 0 and 
there is a connection W of A with C. The hypothesis for the (last) transitivity 
property is illustrated in Figure 3.1. 
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CHAPTER 4 


ARE SETS ALL THERE IS? 


Our next goal is to determine whether the basic results of naive set theory 
in Chapter 2 can be proved on the basis of the axioms of Zermelo. Right 
at the start we hit a snag: to define the crucial notion of equinumerosity we 
need functions; to define countable sets we need the specific set N of natural 
numbers; the fundamental theorem 2.21 of Cantor is about the set R. of real 
numbers, etc. Put another way, the results of Chapter 2 are not only about 
sets, but about points, numbers, functions, Cartesian products and many 
other mathematical objects which are plainly not sets. Where will we find 
these objects in the axioms of Zermelo which speak only about sets? 

An obvious solution is to assume that these non-sets are among the atoms 
which are allowed by Zermelo’s theory and to add axioms which express our 
basic intuitions about points, numbers, functions, etc. This is possible but 
awkward and there is a much better solution. 

A typical example of the method we will adopt is the “identification” of the 
(directed) geometric line n with the set R of real numbers, via the correspon- 
dence which “identifies” each point P on the line with its coordinate x(P) 
with respect to a fixed choice of an origin O. What is the precise meaning 
of this “identification”? Certainly not that points are real numbers. Men have 
always had direct geometric intuitions about points which have nothing to do 
with their coordinates and which existed before Descartes discovered analytic 
geometry. Every Athenian of the classical period understood the meaning of 
the sentence 

Phaliron is between Piraeus and Sounion along the Saronic coast 7 

even though he was (by necessity) ignorant of analytic geometry. In fact, 
many educated ancient Athenians had an excellent understanding of the 
Pythagorean Theorem, without knowing how to coordinetize the plane. What 
we mean by the “identification” of n with R is that the correspondence 
P i— > x(P) gives a faithful representation of n in R which allows us to give 
arithmetic definitions for all the useful geometric notions and to study the 
mathematical properties of n as if points were real numbers. For example, the 


7 These are seaside suburbs of Athens. 
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quoted sentence above is expressed by the inequalities 

x (Piraeus) < x(Phaliron) < x(Sounion), 

assuming that the coordinates increase in the easterly direction. In the same 
way, we will discover within the universe of sets faithful representations of all 
the mathematical objects we need, and we will study set theory on the basis 
of the lean axiomatic system of Zermelo as if all mathematical objects were 
sets. The delicate problem in specific cases is to formulate precisely the correct 
definition of a “faithful representation” and to prove that one such exists. 

4.1. Ordered pair. We consider first the basic (ordered) pair operation. Intu- 
itively, the pair (x, y) of two objects x and y is the “thing” which has a “first 
member” x and a “second member” y, and it is different from the unordered 
pair {x, y} since (for example) {0, 1} = {1,0} while (0, 1) f (1,0). Thus, the 
first characteristic property of the ordered pair is the following: 

(OP1) (x,y) = (x',y') x = x'&y = y'. 

There is a second, perhaps less obvious characteristic property of pairs which 
makes it possible to define Cartesian products: for any two sets A, B, 

(OP2) the class Ax B = df {(x, y) |xG^4&yG.6}isa set. 

Thus, the problem of representing the notion of “pair” in set theory takes 
the following precise form: we must define a definite operation (x, y) such 
that (OP1) and (OP2) follow from the axioms of Zermelo. 

4.2. Lemma. The Kuratowski pair operation 

(x, y) =df {{*}- {*> y}} (4-1) 

has properties (OP1) and (OP2). 

Proof. (OP1). The direction 4= is obvious. For the non-trivial direction 
=> , we distinguish two cases. 

If x = y, then {x, y} = {x, x} = {x}, the set (x, y) = {{x}, {x}} = {{x}} 
is a singleton, hence the set (x',y r ) which is assumed equal to it is also a 
singleton, so that x' = y' and (x',y') = {{x'}}; and since this last singleton 
is equal to {{x}}, we have x = x' and, hence, also y = x = x' = y'. 

If x f y , then the members of (x, y) are the singleton {x} and the doubleton 
{x, y}, and these must correspond with the members {x'j and (x', y'} of the 
equal set (x',y'), so that we must have {x} = {x'}, {x, y} = {x',y'}, and 
then, immediately, x = x' and y = y'. 

(OP2). It is enough to prove that for any two sets A. B, there is a set C such 
that 

x G A&y G B => {{x}, {x, y }} G C; (4-2) 

because then 


AxB = {z ec | (3x G A) (By G B)[z = (x,y)]}. 
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and A x B is a set by the Subset Axiom (III). To prove (4-2), we compute: 

x € A. y e B => {x}, {x, y} C (A U B) 

=> {x}, {x,y} e V(A U B) 

=► {{*}■ {a, j}} CPU US) 

=► (x,y) GP(PUUP)), 


so that we can take C = V(V(A U 5)). 3 

We now fix a specific definite operation (x,y) which satisfies (OP1) and 
(OP2), perhaps the Kuratowski pair defined in the proof of 4.2, perhaps some 
other: from now on we may forget the specific definition chosen, the only 
thing that counts is that the pair operation satisfies (OP1) and (OP2). 

4.3. Exercise. Let 


First (z) = d f < 
Second(z) = d f 


Pair(z) <t=>df (3x)(3y)[z = (x,y)], 

the unique x such that (3 y)[z = (x, y)], if Pair(z), 
z, otherwise, 

f the unique y such that (3x)[z = (x, y)], if Pair(z), 
I z, otherwise. 


It follows that 


Pair(z) •<=>• z = (First(z), Second(z)). 

Using the ordered pair we can easily define triples, quadruples, etc., as well 
as the corresponding products, e.g., 

(x, y,z) = df (x, (y,z)), (4-3) 

(x, y, z, w) = d f (x, (y, z, to)) = (x, (y, (z, to))), (4-4) 

A x B x C =df A x (B x C), (4-5) 


etc. By this definition, a tuple of length n + 1 is a pair with second member a 
tuple of length n. 

4.4. Exercise. For all x, y, z, x', y' , z', 

(x, y, z) = (x',y',z') x = x'&y = y'&z = z'. 


4.5. Disjoint union. For each set A and a fixed object blue , we can think of the 
set of pairs ( blue ) x A as a “blue copy” of A, the act of replacing each a € A 
by the pair ( blue , a) being the set theoretic equivalent of painting a blue. We 
fix two such distinct “colors”, 

blue =df 0, white =df{0}> (4-6) 

and we define the disjoint union of two sets by 

Am B = d f ({ blue I x A) U ({ white! x B ). 

The construction is pictured in Figure 4.1, and the notion is useful, as we will 
see. It should be clear that the specific identity of blue and white must be 
deliberately and instantly forgotten: all that matters is that blue f white . 
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Figure 4.1. Constructing the disjoint union. 

4.6. Exercise. Show that for all sets A, B, ri l±) 0 C A W B. Assuming that we 
use the Kuratowski pair to define products, give an example where the plausible 
inclusion A C A l±! B is not true. 

Next we consider the notion of relation which permeates mathematics. 
Intuitively, a binary relation R between objects ,v € A and y £ B is a condition 
which is satisfied by some x £ A, y £ B and fails for others. For example, the 
relation 

x R y x is a son of y 

is defined on A = {men}, B = {women} and holds for x, y precisely if y has 
given birth to x. The obvious way to represent a binary relation in set theory 
is to identify it with its extension, the set of pairs which satisfy it. 

4.7. Relations. A binary relation on the sets A, B is any subset R of the 
Cartesian product A x B. We will use synonymously the notations 

R(x,y) xRy (x,y) € R. 

Obvious examples of binary relations are the identity and the relations of 
membership and subsethood restricted to some set A, 

x =a }’ ^=>df x £ A & y £ A & x = y, 
x £a Y *£=>df x £ A&Y C A &x £ Y. 

X £ a Y ^=4> d f X C Y C A, 

which by the definition are represented respectively by the sets 

=a =df {(x, y) £ A x A \ x = y}, 

£a =df {(x, Y)£Ax p(A) | x £ Y}, 

C A = df {(X Y) £ V(A) x p(A) ICf}. 

4.8. Relations and definite conditions. The definite conditions of identity =, 
membership € and subsethood C on the domain W of all objects are not 
relations according to 4.7, in fact they are not even “coextensive” with sets 
of pairs, because their extensions are “too large”. This is important, the 
distinction between relations and definite conditions: briefly, every relation 
determines a definite condition but (in general) the converse does not hold. The 
precise situation is detailed in the next Exercise. In practice we will often refer 
to the relation = on the set A. meaning (without possibility of confusion) the 
restriction = A as we just defined it. 
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4.9. Exercise. (1) For every binary relation R C (A x B), the condition 

R*{x,y) ^=»df ( x,y ) G R 

is definite. (Nothing to prove here, unless you want to practice applying 3.18.) 
(2) For every binary definite condition P and any two sets A. B. the restriction 

Pa.b =df {(x,y) € A x B j d(x,y)} 
of P to A x B is a binary relation on A, B. 

4.10. Properties of relations on a set A. Binary relations with both arguments 
ranging over the same set are especially important, and they are classified 
and studied according to the structural properties they may enjoy. Here are 
three such properties which come up often, in various combinations, where 
P C A x A: 

P is reflexive <t=>df (Vx G A)[xPx], 
dissymmetric x=> d f (Vx , y G A)[xPy =>ydx], 

P is transitive <t=>df (Vx,y,z G A)[[xPy8c yd:]=>xd:]. 

We call P an equivalence relation on A if it has all three of these properties. 

Equivalence relations are very useful and we will meet examples of them in 
practically every chapter of these Notes. They are often denoted by symbols 
like ~, so that their three characteristic properties take the form 

x ~ x, 

x ~ y => y ~ x, 

[x ~ y & y ~ z] =>■ x ~ z. 

4.11. Exercise. On each set A. the identity relation {(x, y) | x = y G A}, the 

identically true relation {(x, y) \ x, y G A} and for each B C A the relation 

x ~a/b y ^=>-df x = y G A V [x G B & y G B] 
are cdl equivalence relations. 

4.12. Proposition. Suppose ~ is an equivalence relation on the set A, let for 
each x £ A, 

[*/~] = {.V G A | x ~ y} (4-7) 

be the equivalence class 8 of x, and let 

\A/~ 1 = {[x/~] G V(A) | xeA} (4-8) 

be the set of cdl equivalence classes. It follows that [x/~] 0 for each x G A, 

and for cdl x, y G A, 

x ~ y ^ [x/~] = [y/~], (4-9) 

xfiy ^ [x/~] n [y/H = 0. (4-10) 


8 Each [x/~] is obviously a set, a subset of A, and it would be more appropriately named the 
“equivalence set" of x, but the classical terminology goes way back and has been frozen. 
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so that [A/~] is a family of non-empty and pairwise disjoint subsets of A such 
that U |A/~] = A. 

Conversely, for each family W of non-empty and pairwise disjoint subsets of A 
such that A = (J if, the relation 

x ~ y -£=>df (BX £ <g)[x £ X & y £ X] 
is an equivalence relation on A and [A/~] = IS . 

Proof. Each [x/~] / 0, since x £ [x/~], By the transitivity and symmetry 
of 

t ~ x & x ~ y => t ~ y, t ~ y & x ~ y =» t ~ x 

so that 

x ~ y => (Vt £ A)[t ~ x t ~ y] 

=► [*/~] = \yh\- 

This implies immediately both (4-9) and (4-10). For the converse, the reflex- 
ivity and symmetry of ~ are trivial. If x ~ y and y ~ z, then there exist sets 
X , Y in W such that x, y £ X, y, z £ Y, so in particular y £ X n Y and since 
the sets in g 3 are pairwise disjoint, we have X = Y, so x ~ z. H 

4.13. Exercise. What are the equivalence classes of the equivalence relations in 
Exercise 4.11? 

Following up the same idea, we identify each ternary relation R on the sets 

A, B, C with the set of triples which satisfy it. so that a ternary relation on A, 

B, C is simply an arbitrary subset of A x B x C. We will use synonymously 
the notations 

R{x, y, z) -€=>df (x, y, z) £ R. 

As with relations, we represent functions in set theory by identifying them 
with their “graphs”. 

4.14. Functions. A function (or mapping or transformation) / : A — > B with 
domain the set A and range the set B is any subset / C (Ax B) which satisfies 
the condition 

(Vx £ A)(3\y £ B)[{x,y) £ /], 

in more detail, 

(Vx £ A) (By £ B)[(x,y) £ /], 
and (x, y) £ f & (x, y') £ f => y = y' . 

If we picture the product A x B in the plane as in Figure 4.2, this means that 
every “vertical line” intersects the set / in exactly one point: for x £ A and 
/ : A — > B, we will write, as usual, 

f(x) =df the unique y £ B such that (x, y) £ f , .. 

= the value of / on the element x 

The function space 
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Figure 4.2. A function as a set of ordered pairs. 

(A - B) = df {/ C A x B | / : A - 5} 

= {/ G 7>U x 5) | (Vx G xl)(3!y G B)(rj) G /} 

is a set by the Subset Axiom (III) . 

We will use all the familiar notations and abbreviations in connection with 
functions, e.g., sometimes writing the argument without the parentheses or as 
an index, 

f(x) = fx = f x . 

The i— > notation is also useful; for example, an indexed family of sets is a 
function 

A — (i i—> Aj)j£/ : I —> E 

for some 7/0 and some E, where each A t is a set. We refer to 7 as the index 
set and we define the union and intersection of the family in the usual way, 

U iei A i =df {x G I (3/ G I)[x G A/]}, ^ 

fl teiAi =df {x G I (Vi G I)[x G A,]}. 

We can also define the product of an indexed family, the set of functions which 
select for each i G 7 one element from the value A,-, 

II ie,Ai =df {/ : I - U, eiA, I (V« G /)[/(/) G A,]}. (4-12) 

Injections, surjections, bijections (correspondences), images and pre-ima- 
ges of functions are defined exactly as in the Introduction. We will be using 
the following notations for the relevant function spaces: 

(A >— ► B) =df {/ G (A — > B) | / is an injection, one-to-one}. 

(A — » B) = df {/ G (A — » B) | / is a surjection, onto Bj, 

(A >-» B) = df (A >-> B) n (A — » B) (bijection, correspondence). 

For each X C A, the restriction f \X of a function / ; A — > B is obtained 
by cutting / down so it is defined only on X, 

f \X = d f {{x,y) G / | x G X}. (4-13) 

It is also useful to notice that the basic condition of “functionhood” 
Function)/) ^ df (3A)(3B)[f G (A 5)] 


(4-14) 
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is evidently definite. When we refer to a function f without identifying specific 
sets A, B such that / : A — > B. wc will mean any set / which satisfies the 
condition Function(/). 

4.15. Exercise. Prove from the axioms, that for each function f , the domain of 
definition of f 

Domain(/) =df {x | (3 y)[(x,y) € /]} 

and the image of f 

Imag e(/) = df {y | (3x)[(x,y) e /]} 
are sets, and for each set B, 

if Image(/) C B. then f : Domain(/) — » B. 

As a consequence, 

if Function(/), then f : Domain(/) — > Image(/). 

4.16. About functions as sets of ordered pairs. This “identification” of a func- 
tion / : A — > B with the set of pairs {(x,y) € A x B \ f(x) = y} has 
generated some controversy, because we have natural “operational” intuitions 
about functions and by “function” we often mean a formula or a rule of 
computation. For example, the two functions on the reals 

f(x,y) =df (x + y) 2 , 
g(x, y) =df x 1 + 2 xy + y 2 

are identified in set theory, although they are obviously different as compu- 
tation rules. There is no problem with this if we keep clear in our minds 
that the “definition” 4.14 does not replace the intuitive notion of function but 
only represents it within set theory, faithfully for the uses to which we put this 
notion within set theory. 9 first (and foremost), to define equinumerosity and 
the size comparison condition for sets with no reference to objects outside 
axiomatic set theory, 

A= c B <t=>df (3/)[/ : A >— » B] 

A < c B <t=>df (3/)[/ : A >— > B] 

A < c B 4=>df A < c B & A f c B 

4.17. Exercise. Prove from the axioms that A = c B =^V{A) = c V{B), and 
identify which of the axioms you are using. 

4.18. Exercise. Prove from the axioms that if A = c A' and B =,, B' , then 
Al±l B = c A' l±l B' , A x B = c A' x B' . (A — > B) = c (A' —> B'). 


(A > — » B) f 0, 
(A >->B)f0. 


9 Whether the intuitive notion of function-as-computation-rule can also be represented faith- 
fully in set theory is an interesting problem, for which there does not exist yet a generally accepted 
solution. 
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Figure 4.3. Cantor’s construction of \A\ for a four-element set. 

4.19. Cantor’s notion of cardinal numbers. Ironically, one of the most difficult, 
intuitive mathematical notions to represent faithfully in set theory is that of 
cardinal number , a most basic concept of the subject. Flere is how Cantor 
introduced it in the same 1895 paper from which we quoted the “definition” 
of sets in the Introduction: 

Every set A has a definite ‘power’, which we will also call its ‘cardinal 
number’. 

We will call by the name ‘power’ or ‘cardinal number’ of A the 
general concept, which by means of our active faculty of thought, 
arises from the set A when we make abstraction of its various elements 
x and of the order in which they are given. 

We denote the result of this double act of abstraction, the cardinal 
number or power of A by A. Since every element x, if we abstract 
from its nature becomes a ‘unit’, the cardinal number A is a definite 
set composed of units, and this number has existence in our minds as 
an intellectual image or projection of the given set A. 

After some discussion, Cantor infers from this “definition” that cardinal num- 
bers have the following two fundamental properties: 

A= c % 

if A — c B. then A — B. 

The first of these flows quite naturally from Cantor’s conception: the process 
of abstraction which associates with each x € A a corresponding “unit” u x 
evidently defines a correspondence x i— > u x between A and A. Cantor gives a 
brief argument for the second whose key phrase is that 

A grows, so to speak, out of A in such a way that from every element 
x of A a special unit of A arises. 

To get the second property out of this we must assume that the “special units 
of A” depend only on “how many” members A has and not the nature of these 
members, which begs the question of cardinality, but there it is. 

There is a third, more technical property of cardinal numbers, which Cantor 
uses routinely with no special mention to define and study operations which 
act on infinite families of sets: for every family of sets if, { A | A e If } is a set. 
Thus, however we understand Cantor’s construction, it is quite clear what we 
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must do to represent it faithfully in set theory. We substitute modern notation 
for Cantor’s awkward double bar symbolism. 

4.20. Problem of Cardinal Assignment: to define an operation \A\ on the class 
of sets which satisfies 

(Cl) A = c \A\. 

(C2) if A = c B. then \A\ = \B\. 

(C3) for each set of sets { | X\ \ X e If} is a set. 

The problem is quite difficult and it was not solved until the twenties, by 
von Neumann, whose elegant construction uses both the Axioms of Choice 
and Replacement. We will present it in Chapter 12, as the culmination of a lot 
of work. In the meantime, notice that there are plenty of definite operations 
which satisfy (Cl) and (C3), including the obvious \A\ = A ! And as it turns 
out, these two properties suffice for the development of a very satisfactory 
theory of cardinality. 

4.21. Cardinal numbers. A (weak) cardinal assignment is any definite opera- 
tion on sets A i-> \A\ which satisfies (Cl), and (C3), and it is a strong cardinal 
assignment if it also satisfies (C2). The cardinal numbers (relative to a given 
cardinal assignment) are its values, 

(C4) CardU) •<==> k g Card ^ df (3A)[/« = \A\\. 

4.22. Exercise. Prove that for any cardinal assignment and any two sets A. B . 

\A\ = \B\=*A= C B , 

so that the converse of (C2) is true of all cardinal assignments, including the 
weak ones. 

We fix one, specific (possibly weak) cardinal assignment and we define the 
arithmetical operations on the cardinals as follows: 

k + X =df \k W X\ — c k l±l X, 

K ■ X =df |k x X\ = c k x X, 

^ =df \(x -»■ k)| =c {X -»■ k). 

The infinitary operations are defined similarly: 10 

=df |{(h^) G I X U,' G / k ; I X G Kj}\i 

El iei K ‘ = df iriie/ K >l- 

The motivation is clear, e.g., the sum k + /. is the “number of elements” in the 
set we get by putting together disjoint copies of k and 


10 It is traditional to use the cap Greek It to denote both the Cartesian product of sets and the 
cardinal operation of infinite product, and it does not really cause any confusion. 
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Notice that there is only one choice for |0|, 

O= df |0| = 0. (4-15) 

since only |0| = 0 satisfies 0 = c |0|. It is also convenient to set 

1 =df | {0} | , 2 =df | {0, 1 } | (4-16) 

so we have handy names for the cardinal numbers of singletons and double- 
tons. 

4.23. Exercise. For all sets A. B, \A\J B\ < c \A\ + \B\,and 

if A n B = 0. then \A U B\ = c \A\ + \B\. 

4.24. Exercise. For all cardinal numbers, if «i = c n,2 and ).\ = c X2, then 

K\ 2 ] = c K2 T fa, fiq * X\ = c re 2 * fa, re | 1 — c K-f . 

4.25. Cardinal arithmetic. It looks quite silly to develop the theory of a 
weak cardinal assignment which could be just the identity \X\ = X, but the 
notation of cardinal numbers and the arithmetical operations on them is useful 
for expressing simply complex “equinumerosities”. Consider the formula 

fa x+)i) = c (4-17) 

It looks obvious, it is true by Exercise 4.28, and it expresses exactly the same 
fact as 


((2 l±J ju) — > re) = c (2 — > k) x {ju — > k), (4-18) 

more simply, or so some would say. More significantly: 

(1) the systematic development of formulas like (4-17) leads to a cardinal 
arithmetic which in the end suggests new (and useful) facts about equinu- 
merosities by analogy with ordinary arithmetic; and 

(2) when we do construct von Neumann’s strong cardinal assignment, we 
will have already proved all the interesting facts about cardinals with = f in 
place of = : all we will need to do is remove in our minds the subscript c from 
facts we already understand, because of the following, simple fact. 

4.26. Exercise. If cardinal numbers are defined using a strong cardinal assign- 
ment, then for all cardinals k, X, 

| ac | = k and n = c X k = X. (4-19) 


The basic technique for proving identities of cardinal arithmetic is to use 
systematically (Cl) and the replacement properties of Exercise 4.18. To prove 
the associativity of cardinal addition, for example, we compute: 


k (A 4“ /i) — c ^ hi ( [X T ju') 
=C K l±l {X l±) ju) 
= c (re l±l X) l±l fi 
= c (re + X) + ju 


by defi. 

by defi, (Cl) and 4.18. 
by a direct argument, 
reversing the steps. 


The mathematical essence of the proof is the alleged “direct argument”, which 
in this case is quite easy: 
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4 . 27 . Exercise. Prove that for any three sets K. L, M, 

K\4)(L& M) = c {K\HL)\a M. 

4 . 28 . Exercise. Prove (4-17), by showing first that for any three sets K. L, M , 

if L Pi M = 0, then {{L U M) -> K) = c (L —> K) x (M — > K). 

To see how the more technical condition (C3) comes into play, consider the 
equation 

IUie/A'1 =c Eie/lAi (4-20) 

which should certainly be true when the sets in the family (i i-> Aj) ie j are 
pairwise disjoint. To make sense of it, before trying to prove it, we must know 
that there is a function (i i— > | Aj\), and the proof of this requires (C3), as 
follows: 

4 . 29 . Lemma. For every indexed family of sets A = (/ i— > A,)/ e /, there exists a 
function f : I —> /[/] such that 

f(i) = \Ai\ (/£/). 

Proof. By (C3) with 

% = {Ai | i£l} = A[I], 

there exists a set W which contains every \ A, | for i £ I, and we can set 

/ =df {('>) e / x W I w = \Ai\}. H 

As it happens, equations like (4-20) cannot be proved without the Axiom of 
Choice, so we will have little need of (C3) before Chapter 8 . 

4 . 30 . Structured sets. A topologiccd space is a set X of points endowed with a 
topological structure, which is determined by a collection of subsets of X 
satisfying the following three properties: 

1. 0,ie 5C 

2. A. B £ => A P B £ IF . 

3. For every family f C of sets in PT , the unionset IJ if' is also in PT . 

A family of sets PF with these properties is called a topology on X , with open 
sets its members and closed sets the complements of open sets relative to X , 
i.e., all X \ G with G open. 

Notions like this of sets “endowed” with structure abound in mathematics: 
there are graphs, groups, vector spaces, sheaves, manifolds, partially ordered 
sets, etc. etc. In each of these cases we have a set X, typically called “the 
space”, and a complex of related objects which impose a structure on the 
space — functions, families of sets, other spaces with their own structure, etc. 
The pairing operation provides a simple and flexible way to model such notions 
faithfully in set theory. 

A structured set is a pair 


U = ( A,S ), 


(4-21) 
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where A = Field (U) is a set, the field or space of U , and S is an arbitrary 
object, the frame 11 of U. 

For example, a topological space is a structured set (X, T), where the frame 
IT is a topology on X, as above. A group is a structured set 

U=(G,(e,-)) (4-22) 

where e G G and • : G x G — > G is a binary function, satisfying the group 
axioms, which do not concern us here. Notice that by the definition of triple 
(4-3), definition (4-22) is equivalent to 

U = (G, e, •). (4-23) 

It is quite common that the frame of a structured set is a tuple of objects, and 

then the structured set is also a tuple, with its field as the first element. We 
will meet numerous examples of this in the sequel. 

Following usual mathematical practice, we will systematically confuse a 
structured set with its field when the frame is understood from the context or 
is not relevant. For example, we will refer to “the topological space X” rather 
than “(XT')”, with “points” the members of X, “subsets” the subsets of X, 
etc. In the general case, the members of a structured set U are the members 
of Field( U), 

x G U <t=>df x G FieldU/), (4-24) 

the subsets of U are the subsets of Field (U), etc. Notice that the termino- 
logical convention (4-24) cannot possibly cause a misunderstanding: since we 
have (deliberately) not settled on a specific pairing operation — and have even 
left open the possibility that (A, S) may be an atom (!) — the statement 

x G (A,S) 

cannot possibly mean anything until we define it, and we just did this by 
(4-24). 


Problems for Chapter 4 

The definition of ordered pair in the proof of 4.2 is due to the Polish 
set theorist and topologist Kuratowski. A few years before Kuratowski’s 
construction, the American analyst Wiener had discovered the following, 
somewhat more complex but interesting solution of this problem. 

x4.1. (Wiener) The properties (OP1) and (OP2) in 4.1 hold with the following 
definition of pair: 

( x >y) =df {{0. {a}}- {{ v}}}. 


11 It would be more suggestive to call S the structure of the structured set but the word 

is heavily overloaded in logic and set theory and it is best to avoid attaching it to one more precise 
notion. Some people call "structures” what we have called "structured sets” here, at least when 
they are simple enough. 
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x4.2. The properties (OP1) and (OP2) in 4.1 hold with the following definition 
of pair: 

(x,y) =df {{0 ,x},{1, y}}, 

where 0. 1 are two distinct objects, e.g., those defined by (4-16). Hint. Com- 
pute (0.0), (0. 1), (1, 1) and (1.0) (with this pair) to see what goes on, and 
then take cases on whether x = y or not. 

x4.3. Prove from the axioms that for all sets A, B, C , 

((A x B) —> C) = c (A->(B-> C)). 

x4.4. Prove from the axioms the theorem of Cantor 2.21, that for every set A, 
A < c V(A). Which axioms do you need? 

x4.5. A binary relation ~ C (A x A) is an equivalence relation on A if and 
only if there exists some set Q and a surjection 

n : A -*► Q (4-25) 

such that 

x ~ y ^==> 7i(x) = n(y). (4-26) 

When (4-25) and (4-26) hold, we call Q a quotient of A by ~ and n a 
determining surjection of The proof of 4.12 yields the quotient \A /~] and 
the determining surjection (x i-> [x/~]), but in specific cases there exist other 
quotients which help us understand better the structure of the equivalence 
relation at hand. 

x4.6. Suppose ~ is an equivalence relation on A and n : A — > A satisfies 

x ~ y=>n{x) = n{y) G [x/~]. 

Prove that n is a determining surjection witnessing that its image n[A\ C A is 
a quotient of A by ~. 

x4.7. Fix an element xo € A in some set and define on the function space 
(A — > B) the relation 

/ ~ g <=>df fix 0 ) = g(x 0 ). 

Prove that ~ is an equivalence relation and find a determining surjection 
7 z : {A ^ B) B which witnesses that B is a quotient of (A — > B) by 

x4.8. Let xo / x\ be two distinct elements in A and find a determining 
surjection which witnesses that ( B x B) is a quotient of (A — > B) by the 
equivalence relation 

/ ~ g «=>df fix o) = g(x 0 ) &/(xi) = g(x i). 

x4.9. Suppose ~ is an equivalence relation on A and / : A — > A is a function 
which respects i.e., 

X ~ y=>f( X ) ~ fiy). 
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Figure 4.4. 

Let O be any quotient of A by ~. Prove that there exists a unique function 
f*'-Q—>Q such that the diagram in Figure 4.4 commutes, i.e., n f = f*n, 

f*(nx) = n(f(x)), ( x G A), 

where n : A — » Q is a determining surjection. 
x4.10. For all cardinal numbers, n, X, ju, 

K + 0 = c IS, K • 0 = c 0, K • 1 =c K. 
x4.11. For all cardinal numbers is, X. ju, 

is + (X + ju) = c (is + X) + ju, 
is, X == c X ~\~ is, 

x4.12. For all cardinal numbers is, X, ji, 

k ■ {X • ju) = c (is • X) ■ /<, 

K ■ X = c X ■ K, 

is • ( X /i) — c is • X 4~ is ' ju . 

x4.13. For all cardinals k, \V(k)\ = c 2 k . 

x4.14. For all cardinal numbers is, X, ju, 

k° = c 1 , k 1 = c k, is 2 = c is ■ is. 

x4.15. For all cardinal numbers is, X, ju, 

(, k-XY = c kV-X* 1 , 
k U+m) — . k m 

rv — q r\j fxi , 

(k 2 Y = c 

x4.16. For all cardinal numbers is, X. fi 

is /i ■ > ' is 4- X G i_i -\- X, 

K <c /< => IS ■ X < c JU ■ X, 

X <c /-l => K 2 <c IS M (is Y o), 

is <c X => K M < c X M . 

For what values of X, /.i does the third implication fail when k = 0 ? 
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Caution! As we will see later, strict inequalities between infinite cardinal 
numbers are not always respected by the algebraic operations, for example, 
we may have 

k < c ju, but k + X = c /i + X. 
x4.17. For all A, B and all cardinals k, X, 

Yi ieA B = {A^B), n, e ^ = ^- 

x4.18. Suppose a ^ b are two distinct objects and n a . K h are cardinals, and 
prove that 

K a + Kb =c ^2i£{a,b} K i’ 

K a ' K b =c riie{a,A} K '- 

x4.19. Prove that for all indexed families of cardinals, 

K ' = c ’Yhi£l K ‘ ' ^i- 

x4.20. Show that k ■ X = 0 k = 0 V X = 0. Show also one of the 

directions of the equivalence 

n,e/«r = 0 ^ (3/ e I)[k, = 0], (4-27) 

(If you think you can show both directions of (4-27), think again and find 
your error, since one direction requires the Axiom of Choice.) 

x4.21. The notion of equivalence according to Zermelo 3.28 coincides with 
equinumerosity, i.e., 

A = c B •<=>■ A ~ z B. 

The definition 2.6 of infinite and finite sets refers to the set N of natural 
numbers and we cannot study these concepts axiomatically before we give a 
definition of N directly from the axioms. There is, however, another, simpler 
definition of these notions which we can give now and which we will later 
prove (with the Axiom of Choice) equivalent to 2.6. 

4.31. Definition. A set A is infinite according to Dedekind if there exists an 
injection 

/ : A >-* B C. A 

from A into a proper subset of itself. If A is not Dedekind-infinite, then it is 

Dedekind-finite. 

x4.22. If A is Dedekind-infinite and A = c B, then B is also Dedekind-infinite. 
x4.23. If A is Dedekind-finite, then every subset of A is also Dedekind-finite. 
x4.24. Every set / which satisfies the conditions 

0 G I, (Vx)[x G / => {x} € /] 


is Dedekind-infinite. 
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Figure 4.5. Zermelo’s proof of the Schroder-Bernstein Theorem. 

Most of the properties of Dedekind-finite sets require the Axiom of Choice 
for their proof. Here is one which does not. but it requires some thinking. 

*x4.25. If A is Dedekind-finite and t £ A, then A U {?} is also Dedekind -finite. 
Hint. Assume n : A U {t} >-> A U {t} misses some point in A U {?}, and 
consider cases on whether it misses t ; or nit) = t: or “otherwise”, which is 
where the thinking is required. 

The classical proof of the Schroder-Bernstein Theorem 2.26 uses induction 
on the natural numbers and we cannot justify it now. In the next two problems 
we outline a very different proof (due to Zermelo), somewhat opaque in its 
motivation but elegant, short and in no way dependent on the natural numbers. 

*x4.26. If A' C B C A and A = c A 1 , then also A = c B. Hint. Suppose 
/ : A >—» A ' is a correspondence which witnesses that A = c f[A\ = A 1 , and 

Q = B\ f[A] 

is the set of objects in B which are not in the image of A by /. We define the 
family of subsets of A 

F = {X\QUf[X]CX} 

and we first verify that its intersection is a member of it, 

T= df n^e^, 

so that Q U f[T] C T . With a bit more work we can show that, in fact, 
T = Q U f[T]\ this identity then implies that 

B = Tu{f[A]\f[T]), 

which completes the proof, since T and [f[A] \ f[T ]) are disjoint sets and 
their union is (easily now) equinumerous with A. 

*x4.27. Use Problem x4.26 to give a proof of the Schroder-Bernstein Theorem 
from the axioms. Hint. If / : A >-* C and g : C >-> A, then 

^ =c gf[A\ C g[C] c A, g[C] = c C. 
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CHAPTER 5 


THE NATURAL NUMBERS 


Our fundamental intuitive understanding of the natural numbers is that there 
is a (least) number 0, that every number n has a successor Sn, and that if we 
start with 0 and construct in sequence the successor of every number 

0, S0= 1, SI =2, S2 = 3, ... 

forever, then in time we will reach every natural number. In set theoretic terms 
we can capture this intuition by the following axiomatic characterization. 

5.1. Definition. A Peano system or system of natural numbers is any structured 
set 

(N, 0, S) = (N, (0. S)) 

which satisfies the following conditions. 

1. N is a set which contains the element 0, 0 € N. 

2. S is a function on N, S : N — > N. 

3. S is an injection, Sn = Sm =>• n = m. 

4. For each n G N, Sn f 0. 

5. Induction Principle. For each ICN, 

[0 G X8c(Mn G N)[n G X=>Sn G X]] => X = N. 

These obvious properties of the natural numbers are called the axioms of 
Peano in honor of the Italian logician and mathematician who first proposed 
them as an axiomatic foundation of number theory, following their earlier 
formulation by Dedekind. Most significant among them is the Induction 
Principle, whose typical application is illustrated in the proof of the next 
lemma. 

5.2. Lemma. In a Peano system (N, 0, S), every element n f 0 is a successor, 

if n f 0, then (3m G N)[« = Sm], 
and for each n , Sn f n . 

Proof. To prove the first assertion by the Induction Principle, it is enough 
to show that the set 

X = {n G N | n = 0 V (3m G N)[n = Sm]} 
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satisfies the conditions 

0 G X, (Vn G N)[« G X => Sn G X ], 

and both of these are obvious from the definition of X. In the same way, for 
the second assertion it is enough to verify that SO ^ 0 (which holds because, 
in general, Sn ^ 0) and that Sn ± n => SSn ^ Sn: this holds because S is 
one-to-one, so that SSn = Sn =$■ Sn = n. H 

Number theory is one of the richest and most sophisticated fields of math- 
ematics and it is by no means obvious that it can be developed on the basis of 
these five, simple properties; in fact they do not suffice, one also needs to use 
set theory which (in its naive form) Peano took for granted, as part of “logic”. 
Here we will only show that the axioms imply the first, most basic properties 
of addition, multiplication and the ordering on the natural numbers, which 
is all we need. The proofs we will give, however, are characteristic examples 
of the use of the Peano axioms in the more advanced parts of the theory of 
numbers. 

If number theory can be developed from the Peano axioms, then to give a 
faithful representation of the natural numbers in set theory, it is enough to 
prove from the axioms the following two theorems. 

5.3. Theorem (Existence of the natural numbers). There exists at least one Pe- 
ano system (N, 0, S). 

5.4. Theorem (Uniqueness of the natural numbers). For any two Peano systems 
(Ni, 0i, Sj) and (N 2 , 0 2 , «S 2 ), there exists one ( and only one) bijection 

n : Ni >— » 

such that 

7t(0i) =0 2 , 

Ti(Sin) = Sin^n) (. n G Ni). 

A bijection n which satisfies these identities is an isomorphism of the two 
systems (Ni,0i, Sj) and (N 2 , 0 2 , S 2 ), so that the theorem asserts that any two 
Peano systems are ( uniquely ) isomorphic . 

The Existence Theorem is very simple and we can prove it immediately. 

5.5. Proof of the existence of natural numbers. 5.3. The Axiom of Infinity (VI) 
guarantees the existence of a set / such that 

0 G I. 

( dn)[n G / => {n} G /]. 

Using this I, first we define the family of sets 

S = {X C / | 0 G V&(V«)[« G X=>{n} G X]} 
so that obviously / G -J , and then we set 

N = n^. 0 = 0, S = {(n.m) G N x N | m = {«}}. 
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Figure 5.1. The Recursion Theorem. 

To finish off the proof, it suffices to verify that this triple (N, 0. S ) is a Peano 
system. To begin with, N G J r , because X G => 0 G X and hence 
0 G H S = N, and by the same thinking, 

n G N => (VX G <X)[n G X] => (VX G J r )[{«} G X] => {«} G N. 

This implies immediately the first two of the Peano axioms, the next two 
hold because (in general, for all n, m) {«} = {«;} =>n = m and {«} / 0, 
and the Induction Principle follows directly from the definition of N as an 
intersection. H 

To prove the Uniqueness Theorem 5.4. we need the next fundamental result 
of axiomatic number theory. 

5.6. Recursion Theorem. Assume that ( N . 0, S) is a Peano system, E is some 
set, a G E, and h : E — > E is some function-, it follows that there exists exactly 
one function f : N — > E which satisfies the identities 

/( 0) = a, 

f(Sn)=h(f(n)) (n G N). 

The Recursion Theorem justifies the usual way by which we define functions 
on the natural numbers, by recursion 12 (or induction): to define / : N — > E, 
first we specify the value / (0) = a and then we supply a function h : E — > E 
which determines the value f{Sn) of / at every successor Sn from the value 
/(n) at its predecessor n, f(Sn) = h(f(n)). Our basic intuition about 
the natural numbers with which we started this chapter clearly justifies such 
definitions, so we should also be able to justify them on the basis of the axioms. 

Before we establish the Recursion Theorem, let us use it in the next proof 
which is a typical example of the way it is applied. 

5.7. Proof of the uniqueness of the natural numbers. 5.4. We assume that 
(Ni, 0i, Si) and (N 2 , 0 2 , S 2 ) are Peano systems. By the Recursion Theorem on 


l: The terms “recursion" and “induction” are often used synonymously in mathematics. We 
will follow (he more recent convention which distinguishes recursive definitions from inductive 
proofs. 
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(Ni,0i,Si) with E = N 2 , a = 0 2 , h = S 2 , there exists exactly one function 
n : Ni — > N 2 which satisfies the identities 

rc(Oi) = 0 2 , 

n(S\n) = 5 2 7t(«) (n G Ni), 

and it suffices to verify that this n is a (one-to-one) correspondence. 

(1) n is a surjection, n : Ni — » N 2 . Obviously 0 2 G 7t[Ni] since 0 2 = 7i(0i), 
and 

m G 7r[Ni] => (3/1 G Ni)[m = 7t{n)] 

=> (3/i G Ni)[S 2 /h = Sin(n) = 7r(«Sjzi)] 

=> Sjm G 7t[Ni], 

so that by the Induction Principle on (N 2 , 0 2 , S 2 ), 7r[Ni] = N 2 . 

(2) n is an injection, n : Ni >— > N 2 . It suffices to verify that if we set 

X = {/; € Ni | (V//i € Ni)[k(/m) = n(n)=$m = «]}, 


then 

OiGX n £ X => Sin € X, 

since together with the Induction Principle on (Ni,0i,Sj), this implies that 
X = Ni, which means that n is an injection. For the first condition, 

m ^ Oi => m = S\tn' for some m' , by Lemma 5.2 
=> n(m) = n{S\m') = S27i{m') ^ 0 2 , 

so that if n(m) = 7r(0i) = 0 2 , then m = Oi and Oi G X. For the second 
condition, it is enough to show that 

n G X &n(m) = n(S\n) =>m = Sin. 


By the hypothesis 

n(m) = n (Sin) = S 2 7t(/i) ^ 0 2 , 

which implies that m ^ 0\, since 7t(0i) = 0 2 and Oj G I. By Lemma 5.2 
again, m = S\m! for some m' G Ni, 

7 z(m) = n(S\m') = S27i(m') 

and the hypothesis n(m) = n(S\ii) yields 

S27i(m') = 5 2 7r(«), 

which implies n(m') = n(n). This, in turn, implies m' = n because n G X, so 
that m = S\m' = S\n, the required conclusion. 3 
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5.8. Proof of the Recursion Theorem. Assume the hypotheses and define first 
the set ,>/ of all approximations of the function which we want to construct: 

p G sf <^=>df Function(p) (5-1) 

&Domain(p) C N&Image(p) C E 
&0 G Domain(p) &p(0) = a 
&(Vu G N)[S« G Domain(p) 

n G Domain(p) & p{Sn) = h(p(n))]. 

In words, each p G srf is a function with domain a subset of N and values in E : 
by the third clause, Domain(p) contains 0; and by the last one, Domain(p) is 
“closed downward”, i.e., if Sn G Domain(p), then n G Domain(p), and the 
value p{Sn) is determined by pin). Some examples of approximations are 

{(0, a)}. {(0. a), (SO, /(«))}, {(0. a), (SO, /(«)), (550, /(/(a)))}. . . . 

and they suggest how the required function / is “built up” stage-by-stage by 
the recursive definition. To prove the theorem, we need to show (from the 
axioms only, without “ . . . ” or “etc.”) that there is exactly one approximation 
with domain of definition all of N. 

Lemma. For all p.q G stf and n G N, 

n G Domain(p) n Domain)*/) p(n) = q(n). 

Proof. The set 

X = {n G N | (Vp, q G st) n G Domain(p) n Domain(^) ==> p(n) = q(n) } 

clearly contains 0, since every p G si satisfies p(0) = a. If 

n G X & p G $/ & q £ sf & Sn G Domain(p) n Domain)*/), 

then 

p(Sn) = h(p(n)) because p G stf, 

= /;(*/(«)) because p(n) = q(n), 

= q(Sn) because q G sf , 

so that if n G X, then Sn G X. It follows by the Induction Principle that 
X = N, which completes the proof. H (Lemma) 

The Lemma implies immediately that at most one function / : M E 
belongs to sf, so to complete the proof of the theorem we need only show that 
at least one such / exists. This is the union 

/ = (Jjrf = {(«> w) | (3 p G st)[n G Domain(p) & p(n) = w]}, 
which is a function, because 

(n, w) G / & («, w') G / => (3 p, q G j/)[(«, w) G p & («, w') G */] 

=> w = w' from the Lemma, 
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and then the definition of sf and a similar calculation shows that / G s/. 
The only thing left is to verify that Domain(/) = N, and for that we will 
use once more the Induction Principle. To begin with. 0 G Domain(/), since 
0 G Domain(p) for every p G si and / 0. If n G Domain(/), then there 
exists some function with n G Domain(p), and hence (easily) 

q = p U {(S«, h(p(n)))} G Jrf, 

so that Sn G Domain(^) C Domain)/). H 

5.9. The natural numbers. We now fix a specific Peano system (N. 0, S ) whose 
members we will henceforth call natural numbers, or just numbers, when there 
cannot be any confusion. Following Cantor, we denote the cardinal number 
of N by the first Hebrew letter with the subscript 0, pronounced “aleph-zero”, 

H 0 =df |N|. (5-2) 

Later we will meet its followers H 1; K 2 , etc. Functions a : I I — > A with 
domain N are called (infinite) sequences and we often write their argument as 
a subscript. 

a„ = a{n) {n G N, a : N — > A). 

An obvious choice for N would be the system which we constructed in the 
proof of the Existence Theorem 5.3, where 0 = 0, 

N = {0.{0},{{0}},...} 

and Sn = {/;}. Another choice which some would prefer on philosophical 
grounds is to assert that there exists, in fact, a set 

N = {0,1,2,...} 

of the “true natural numbers”, which are not sets, and the successor function 
S is nothing like the artificial ( n i— > {«}), but it is the “natural successor func- 
tion” which associates with each number n “the next number” Sn. Zermelo’s 
theory allows such non-sets (like the “true numbers”) as atoms, and requires 
only one thing: that the system of natural numbers satisfies the Peano axioms, 
something which every serious person will surely grant. As far as the math- 
ematical theory of numbers and sets is concerned, these two (and all other) 
choices of the objects we will call numbers are equivalent, since we will base 
all our proofs on the Peano axioms alone. 

5.10. The Schroder-Bernstein Theorem 2.26. At this point, we should review 
the proof of this important result and verify that we can now derive it from 
the axioms: this is because the recursive definitions of the sequences of sets 

and {5„}„ 6N are justified by the Recursion Theorem, and their basic 
properties are established by induction and simple manipulations of functions, 
which can all be based on the results in Chapter 4. 

There are many reformulations of the Recursion Theorem which are useful 
in applications. We state here just two of them, whose proofs illustrate how the 
theorem is used. Two more such results are included in the problems, x5.20 
and x5.21. 
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Figure 5.2. The Recursion Theorem with parameters. 

5.11. Corollary (Recursion with parameters). For any two sets Y, E and func- 
tions 

g : Y E, h : E x Y E, 

there exists exactly one function f : N x Y — > E which satisfies the identities 

f(0,y) = g(y) (y s Y), 

f(Sn,y)=h(f(n,y),y) ( y G XhGN). 

Proof. For each y G Y, we define the function h y : E E by the formula 

h y (w) = h(w,y), 

and by the Recursion Theorem we know that there exists exactly one function 

fy-.n^E 

which satisfies the identity 

f y (o) = g(y), 

fy{Sn) = hy{fy{n)) = h(fy(n),y). 

It follows immediately that the function / : N x Y — > E defined by the 
formula 

f(n,y) =df fy(n) (y € Y,n G N) 

satisfies the conclusion of the Corollary. H 

5.12. Corollary (Recursion with the argument as parameter). For every set E , 
each a G E, and every function h : E x N — > E , there is exactly one function 
f : N — > E which satisfies the identities 

/( 0) = a. 

f(Sn ) = h{f{n),n). 

Similarly , with parameters, for every 

g : Y -► E, h : E xN x Y -> E, 
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there exists exactly one function f : N x Y — > E which satisfies the identities 
f(0,y) = g(y) (y £ Y), 

f(Sn,y) = h(f(n,y),n,y) (y £ Y n £ N). 

Proof. For the version with parameters, we define first a function 

f-.Nx F-> N x £ 

by recursion with parameters 5.11, where the component functions First and 
Second are those of Exercise 4.3: 

<£(0, y) = (0,g(y)) 

4>(Sn, y) = (5First(0(n, y)). A(Second(0(«, y)), First(0(n, y)), y)). 

By induction on n, immediately, 

First(0(«, y)) = n, 

and so the function $ satisfies the equations 

0(0 ,y) = (0, g(y)), <j>(Sn,y) = (Sn, h(Second(<j)(n, y)),n, y)). 

This implies immediately that the function 

f(n,y) = Second(^(n, y)) 
satisfies the required identities. 

The uniqueness is easy to prove directly by induction on n. H 

We now use these results to define and establish the basic properties of 
addition, multiplication and the ordering on N. 

5.13. Addition and multiplication. The addition function on the natural num- 
bers is defined by the recursion 

n + 0 = n, ,, ,, 

n + ( Sm ) = S(n + m), 

and multiplication is defined next, using addition, by the recursion 

% 0 = ?’ s , (5-4) 

n ■ Sm = \n ■ m ) + n. 

In more detail, we know from 5.11 that there exists exactly one function 
/ :Nxff^N which satisfies the identities 

/(O.u) =g(n), 
f(Sm,n) = h(f(m,n),n), 

where the functions g and h have been given as sets of pairs, 
g = {(«, n) £ N x N | n £ N}, 
h = {((z, n), w) £ (N x N) x N | w = 5r}, 
and we define addition by 

n + m =df f(m, n), 
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i.e., + = {((«, m),w) | (( m,n),w ) G /}. Such scholastic details do not 
enhance understanding (rather the opposite) and we will avoid them in the 
future. 

5.14. Theorem. Addition is associative, i.e., it satisfies the identity 

{n + m) + k = n + (m + k) (5-5) 

Proof. First for k = 0, 

( n + m) + 0 = n + m = n + (m + 0). 

using twice the identity w + 0 = w directly from the definition of addition. 
Inductively, assuming that for some k 

( n + m) + k = n + ( m + k), (5-6) 

we compute: 

(n + m) + Sk = S((n + m) + k) 

= S(n + (. m + k)) by (5-6) 

= n + S(m + k) 

= n + ( m + Sk) 

where the steps we did not justify follow from the definition of addition. H 
The commutativity of addition is not quite so simple and requires two 
lemmas. 

5.15. Lemma. For every natural number n. 0 + n = n. 

Proof. By induction, 0+0 = 0 follows from the definition, and if 0 + n = n, 
then 0 + Sn = S( 0 + n) = Sn. H 

5.16. Lemma. For all n. m, n + Sm = Sn + m. 

Proof. By induction on m , first for in 0. immediately from the definition: 
n + SO = S{n + 0) = Sn = Sn + 0. 

At the induction step, we assume that for some m 

n + Sm = Sn + m (5-7) 

and we must show that 

n + SSm = Sn + Sm. 

Compute: 

n + SSm = S{n + Sm) by the definition 
= S(Sn + m) from (5-7) 

= Sn + Sm by the definition. H 

5.17. Theorem. Addition is a commutative function, i.e., it satisfies the identity 

n + m = m + n. 
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Proof is by induction on m, the basis being immediate from Lemma 5.15. 
At the induction step, we assume that for some specific m 

n + m = m + n (5-8) 


and we compute: 


n + Sm = S(n + m) 
= S(m + n) 
= m + Sn 
= Sm + n 


by the definition 

from (5-8) 

by the definition 

by Lemma 5.16. H 


5.18. Exercise. For every natural number n. the function ( s *—> n + s) is one-to- 
one, so that n + s = n + t ==> s = t, and in particular 

if n + s = n, then s = 0. 


5.19. Definition. A binary relation < on a set P is a partial ordering if it is 
reflexive, transitive and antisymmetric, i.e., for all x, y, z e P. 

x < x (reflexivity) , 

x < y & y < z =>• x < z (transitivity), 

x < y & y < x => x = y, (antisymmetry). 

In connection with partial orderings we will also use the notation 

x < y x < y&x f y. 

The partial ordering < is total, or linear, or simply an ordering, if, in addition, 
any two elements of P are comparable in <, i.e., 

(Vx, y G P)[x < y V y < x], 

or equivalently 

(Vx, y G P)[x < y V x = y V y < x]. 


5.20. Definition. The binary relation < on P is a wellordering of P, if it is a 
total ordering of P and, in addition, every non-empty subset of P has a least 
element, 

(VX C P)[x f 0=>- (3x G X)(Vj G X)[x < y]]. 

Correct English would have us call these “good orderings”, and in fact this is 
what they are called in every other language, but the awkward “wellordering” 
has been established so firmly that it is hopeless to try and change it. 

5.21. Definition. The order relation < on the natural numbers is defined by 
the equivalence 

n < m <^=>-df (3j)[h + s = m]. 


The most basic property of < is: 

5.22. Lemma. For all natural numbers n,m, 

n < Sm n < m V n = Sm. 
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Proof. If n < Sm, then there is some t such that n + t = Sm by the 
definition, and we consider two cases. Case (1), t = 0. Now n + 0 = Sm, 
hence n = Sm. Case (2), n + t = Sm for some t f 0. Now, by Lemma 5.2, 
t = Ss for some s, so that n + Ss = Sm. hence S{n + s) = Sm, hence 
n + s = m because S is an injection and hence n < m. The converse direction 
of the Lemma is easier. H 

5.23. Theorem. The relation < on the natural numbers is a wellordering. 

Proof. Reflexivity is immediate from n + 0 = n and transitivity holds 
because n + s = m &m + 1 = k => n + (s + t) = k. To prove antisymmetry, 
notice that if n + s = m and m + t = n, then n + (s + t) = n and by Exercise 
5.18 we have s + t = 0; this implies t = 0 (otherwise s + t is a successor) and 
hence m = n . 

Proof of linearity. We show that (Mn)[n < m V m < n], by induction on m. 
Notice first that for every n, n < Sn, because n + SO = Sn. 

Basis. For every n, 0 + n = n and hence 0 < n. 

Induction step. We assume the induction hypothesis 

(Mn)[n < m V m < n] 

and show that for each n, n < Sm V Sm < n. The induction hypotheses 
naturally splits the proof up in two cases. If n < m, then n < Sm because 
m < Sm and < is transitive. If m < n. then for some t.m + t = n, and again 
we have two cases: if t = 0. then n = m < Sm, and if t f 0, then t = Ss for 
some 5, so m + Ss = n and from Lemma 5.16 Sm + s = n, hence Sm < n. 

Proof of the wellordering property. Towards a contradiction, suppose that 
X is non-empty but has no least element and let 

Y = {n £ N | (Mm < n)[m df]}, 

so that obviously 

Ffll = 0. (5-9) 

It is enough to show that 0 e Y and n € Y ==> Sn G Y, because then Y = N 
by the Induction Principle and hence X = 0 by (5-9) , which is a contradiction. 

Basis. Oe 7. Since 0 is the least number, we must have 0 ^ X (otherwise 
X would have a least member) and also m < 0 =>• m = 0 =» m £ X, so 
0 G Y. 

Induction Step. The induction hypothesis n G Y and the definition of 
Y imply that (Mm < n)m £ X, and then we know from Lemma 5.22 that 
m < Sn => m < n V m = Sn. Hence to verify that Sn G Y, it is enough to 
show that Sn £ X. But if Sn were a member of X, then it would be the least 
member of X since 

m < Sn ^==> m < n by Lemma 5.22 
=> m £ X by the ind. hyp. 

This shows that < has the wellordering property and completes the proof of 
the theorem. H 
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About wellorderings, in general, we will say a lot in Chapter 7. In the 
special case of the natural numbers, the fact that N is well ordered by < is 
another manifestation of the Induction Principle. 

Before we begin studying the applications of recursion to the theory of 
finite and countable sets, we should recall the warning issued in 3.24: some 
of them require the Axiom of Choice and we will not be able to justify them 
axiomatically until we add that axiom to our system in Chapter 8. Most, 
however, can be established by judicious applications of the general method 
of proof which can be symbolized by the coupling 

recursive definition — inductive proof. 

First we repeat some of the definitions of Chapter 2, with the axiomatic 
notions now at our disposal. 

5.24. Definition. For any two natural numbers n < m, the (hall'-opcnj interval 
from n to m is the set 

[n. m) =df {k € N | n < k& k < m}. 

5.25. Exercise. For each n, [ n , n) = 0, and for all n < m, 

[n, Sm) = [n, m ) U {«;}. 

5.26. Definition. A set A is finite if there exists some natural number n such 
that A = c [0, «); infinite if it is not finite; and countable if it is finite or 
equinumerous with N. By Proposition 2.7 (which follows easily from the 
axioms), 

A is countable •<=>• A < c N. 

The finite cardinals are the cardinal numbers of finite sets. 

The next crucial property of finite sets is the first, basic result in the field of 
combinatorics. 

5.27. Pigeonhole Principle. Every injection f : A >— > A of a finite set into itself 
is also a surjection, i.e., f[A] = A. 

Proof. It is enough to prove that for each natural number m and each g. 

g : [0./«) >— > [0, m) =>g[[0, m)\ = [0, m), (5-10) 

for the following reason; if / : A >—> A is an injection and 71 : A >-» [ 0, m) 
witnesses that A is finite, we define g : [0. m) — ► [0, m) by the equation 

g(i) = n(f(n~ l (i))) ( i<m ), 

so that (easily) 

f(x) = n~ l (g(n(x))) (xeA). (5-11) 

Now g is an injection, as a composition of injections, so that from (5-10) it 
is a bijection; but then / is also a bijection, because it is a composition of 
bijections, by (5-11). 
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0 1 2 


0 1 2 • • • v ■ ■ ■ m - 1 m 

Figure 5 . 3 . Case (3) in the Pigeonhole Principle proof. 

The proof of (5-10) is (naturally) by induction on m. It is important to 
note at the outset that what we will show is the general assertion 

(Vg) [g : [0 ,m) >-► [0. m) => g[[0, m)\ = [0. m)\ . (5-12) 

because in the verification of the induction step for some g we will need the 
induction hypothesis for various other g’s. 

Basis. (5-10) is trivial when m = 0, 1, because there is only one function 
g : [0. m) — > [0. m) in these cases and it is a bijection. 

Induction Step. We assume (5-12) for some m > 1 and we proceed to 
prove that every injection 

g : [0, Sm) >— ► [0, Sm ) 

is a surjection. From Exercise 5.25 we know that 

[0. Sm) = [0. m) U {/«}. 

and the proof naturally splits into three cases. 

Case (1). m ^ Image(g). Consider the restriction h of g to the interval 
[0. in), which is defined by 

h(i) = g(i) ( i<m ), 

i.e., as a set of pairs, h = g \ {(m,g(m))}. This takes all its values in [0, m) 
and it is certainly an injection, so the induction hypothesis holds for it and 
hence /;[[0, m )] = [0, m). This means that g[[0, m)] = [0. m), which is absurd, 
because the Case Hypothesis implies g(m) < m so that the value g{m) is 
taken on twice and g is not an injection. 

Case (2). g(m) = m. By the same reasoning, the restriction h is a bijection 
h : [0, m) >-» [0. m), and hence (trivially now) g is also a bijection. 

Case (3). There exist numbers u, v < m such that 

g(u) = m, g{m) = v. 

In this most interesting case, we define the function h' : [0, m) —> [0, m) by 
the formula 

f g(i), if / < m&i ^ u, 

//(/) = < -r . 

[ v, if i = u. 

Now W is an injection, because it agrees with g at all arguments except u, 
where it takes the value v: and v ^ g(j), for every j < m. because g{m) = v 
and g is an injection. The induction hypothesis applied to h' implies that 
h'[[ 0, m )] = [0, m), and using this (easily), g[[0, Sm)] = [0, Sm). H 
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As a first application we can give a rigorous proof of the following “obvious” 
result. 

5.28. Corollary. The set N of natural numbers is infinite. 

Proof. The function {n i— > Sn) is an injection of N into N \ {0}. H 

It follows that “infinite, countable” means precisely “equinumerous with 
N”, in accordance with our basic intuitions: a set A is countable, infinite just 
when \A \ = c Mo- 

5.29. Corollary. For each finite set A. there exists exactly one natural number 
n such that A = c [0. n). We let 

#{A) =df the unique n G N such that [A = c \A \ = c [0, n)] (5-13) 

and we naturally call #(A) the number of elements of A. 

Proof. If A = c [0. n) and A = c [0, m) with n < m, then [0. n) = c [0, m) and 
the correspondence n : [0. m) >— > [0, n ) contradicts the Pigeonhole Principle, 
since [0. n) is a proper subset of [0. m). H 

From this point on we can proceed to prove all the basic properties of finite 
sets by induction on the number of elements in them, which is essentially their 
cardinal number. The method is illustrated in the problems. 

5.30. Strings. In Chapter 2 we used the //-Ibid Cartesian product A n to rep- 
resent sequences of length n from a given set A. This is not convenient when 
we wish to study the set of all finite sequences from A, and it is better to 
represent these using functions on initial segments of N. For each set A, we 
define the set of finite sequences, words or strings from A by 

am — df { u G N x A | Function(«) & Domain)//) — [0 , /7 ) j- , , _ , 

A*= df U n °V w > ( } 

and we let 

lh(w) = d f max{/ | i = 0 V i — 1 G Domain(w)} (w G A *); (5-15) 

this is the length of the string u. so that lh (u) =0 exactly when u is the empty 
string 0. We also let 

uQv *^=>df u C v (u,v € A*), (5-16) 

and we call u an initial segment of v if u C v. If a 0 - • • • , a n -\ are elements of 
A, we let 

(«o. • • • ,a„~ i) =df {(0, a 0 ), . . . fin - l,a„_i)} (5-17) 

be the sequence of these objects and, in particular (with n = 0, 1), 

()=0. <a)=df{(0,a)}. (5-18) 

For any two strings u,v, the string 

u-kv = (u( 0), .... u(lh(u) — 1), u(0), .... u(lh(u) — 1)) 
is the concatenation of the strings u and v. 


(5-19) 
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For each / : N — > A and each natural number n, 

f(n) =df / f[0,w) = {(i,/(0) \ i<n} (5-20) 

is the restriction of / to the initial segment [0, n) of N. For example, 

7(o) = 0, 7(i) = {(o,/(o))} 

and we can recover / from /, since 

i < n=>f(i) = /(«)(/). 

5.31. Definition. For each cardinal number k and each n G N, we set 



5.32. Proposition. For each countably infinite set A and each n > 0, 

A= c AxA= c A m = c A*. 

As equations of cardinal arithmetic, these read: 

K 0 = c K 0 -Ko= c Ho" = c |N 0 *|. (5-21) 

Proof. The inequalities from left to right are trivial, so by the Schroder- 
Bernstein Theorem it is enough to show N* < c N. We need to start with some 
injection 

/;:NxNmN, (5-22) 

and let us first suppose that we have one. Using it. we define by recursion an 
injection n n : N ( " + 0 >— > N, for each n, so that 

7Io(«) = u( 0), 

n„+\{u) = p(n„{u ([0, n + 1)), u(n + 1)); 

in full detail (for the last time), this comes from the Recursion Theorem, by 
setting 

Ko = {(w, w) | u € N (1 \ (0, w) € u}. 
n „+ 1 = {( u , w) | u G N^" +2 \ w = p{n n (u ([0. n + 1), u[n + 1))}. 
Finally, the function 

7i(u) = (lh(«) - \,n Mu) _ x {u)) 

proves that (J 7oN ( " +1) < c N x N, from which the full result follows immedi- 
ately by using p once more. As far as choosing a p to start with, everyone has 
their favorite way of coding pairs and Cantor’s illustrated in Figure 2.2 will 
certainly do. Here is another one. due to Godel and pictured in Figure 5.4: 

{ (m + l) 2 — 1, if m = 72. 
n 2 + 772, if 777 < 77, 

772 2 + 777 + 77, if 72 < 777. 

The proof that it actually works is fun (Problem x5.24). H 
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Figure 5.4. A pairing ofN x N with N. 

It is sometimes useful to think of A* as a generalization of N. generated 
from the empty string ( ) (instead of 0) by iterating the appending operations 

S a (u) = u*(a) (a £ A), 

one for each a £ A. This picture motivates the following, simple but useful 
method of definition by recursion on A*. 

5.33. String Recursion Theorem. For any two sets A, E, each a £ E and each 
function h : E x A —> E, there is exactly one function f : A* —> E which 
satisfies the identities 

/(<))=«. 

f(u*(x)) = h(f(u),x ) (h £ A*,x £ A). 

Similarly , with parameters, for any 

g-.Y^E, h : E x A x Y -> E, 

there exists exactly one function f : A* x Y — > E which satisfies the identities 

f({),y)=g(y) (>’ e t), 

f(u*(x),y)=h(f(u,y),x,y) ( u £ A* ,y £ Y,x£A). 

Proof. For the version with parameters, we appeal to Corollary 5.12 to 
obtain a function : N x d* x Y — > E which satisfies the identities 

d>(0 .u.y) = g{y), 

<f>(n + 1, u, y) = h{c/){n, u, y), u{n), y), 

and we set 

f(u,y) = <A(lh (u),u,y). 

Clearly /(( ), y) = <j>(0, u, y) = g(y), so / (u, y) satisfies the first of the two 
required identities. To see that it also satisfies the second, we show by an easy 
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induction on n that for any two strings u, v, 

(Vi < n)[u(i) = v(z)] =$■ <j>(n, u, y) = <j>(n, v,y); (*) 

this is immediate at the basis since <f>(0, u, y) = g(y) does not depend on u, 
and in the induction step, assuming that (Vi < n + 1)[m(i) = u(i')], 

<j)(n + 1, u, y) = h(<j>(n, u, y), u(n), y) 

= h(<l>(n, v , y), v(n), y) (ind. hyp. and hyp.) 

= <f>(n + l,v, y). 

Now, using (*), we compute, for any u with lh(w) = n and any x G A, 
f(u*{x),y) = 4>(n + 1, u*{x),y) 

= h(<t>(n, u -k(x),y),x, y) 

= h(<j)(n, u, y), x, y) (by (*)) 

= h(f(u,y),x,y) (by the def. of / ) 

The uniqueness of / (u, y) is easy, by induction on lh(«). H 

5.34. The continuum. The classical notation for the cardinal of 'P(N) uses the 
German (Fraktur) letter ‘c’, 

c=df |P(N)|= f 2 H o. (5-23) 

It is quite easy to establish the basic facts about c using the properties of K 0 in 
(5-21) and elementary cardinal arithmetic. For example: 

c • c — c 2 Ho • 2 Ho = c 2 Ho+H ° = c 2 H o = c c. 

The Schroder-Bernstein Theorem is also very useful: for example 

c = c 2^° < f K 0 Ho < c = c (2 H «) H o = c 2^o =e 2 Ho _ c , 

which by Schroder-Bernstein implies that 

C=c H 0 Ho = c c H °. 

Some of the problems ask for computations of this type. On the other hand, 
the equinumerosity R = c V(N) will follow easily from the axioms once we have 
defined the real numbers M in Appendix A, so that the Continuum Flypothesis 
is equivalent to the proposition 

(CH) (Vk < c c)[k < c H 0 V k = c c]. 

We will discuss CH extensively in Chapter 10, it is not that easy to resolve. 

Problems for Chapter 5 

x5.1. Multiplication on the natural numbers is associative. 
x5.2. Multiplication on the natural numbers is commutative. 
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x5.3. Exponentiation on the natural numbers is defined by the following 
recursion on nr. 

n°= 1, 
n Sm = n m ■ n. 

Show that it satisfies the following identities (for n ^ 0): 

n (m+k) = n m ■ n k , 
n [mk) = (; n m ) k . 

x5.4. Suppose (Ni.Oi.Si) and (N 2 .O 2 .S 2 ) are two Peano systems, +i, -i, 
+ 2 , -2 are the functions of addition and multiplication in these systems, and 
n : Ni >— » N 2 is the “canonical” isomorphism between them defined in the 
proof of Theorem 5.4. Show that n is an isomorphism with respect to addition 
and multiplication also, i.e., for all n. m e N ] . 

n[n +1 m ) = n{n) +2 n(m), n(n -i m) = n{n) -2 n(m). 

x5.5. Suppose (Ni, 0i. Si) and (N 2 , 02 , S 2 ) are two Peano systems, < 1 , <2 are 
the respective wellorderings and n : Ni >-» N 2 is the canonical isomorphism. 
Show that n is order preserving, i.e., for all n. m e Ni, 

n <1 m 7 z(n) <2 7 i{m). 

x5.6. Every subset B of an interval [0. n ) is equinumerous with some [0, m), 
where m < /;. It follows that if A is finite and B C A, then B is finite and 
#(B) < #{A). 

Every cardinal number is a set; a finite cardinal n is a cardinal number which 
is a finite set. 

x5.7. Prove that for every finite cardinal number k. 

k= c [0, #(«)). 

x5.8. Show that for all n, m. [0, m) = c [ n . n + m) and infer that the union of 
two finite sets A. B is finite and such that 

if A n B = 0. then #{A U B) = #(A) + #(B). 

It follows that for any two finite cardinals k, X, 

#(« + X) — #(k) + #(X). 

x5.9. WS is a finite set and every member of it is a finite set, then the unionset 
(J is also finite. 

x5.10. The product of two finite sets A. B is finite and such that 
#{A x B) = #(A) ■ #(B). 

It follows that for any two finite cardinals k, X , 

#{k-X) = #{k) ■ #(X). 
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x5.11. The powerset of every finite set A is finite and 

#(P(A)) = 2 #(a) . 

It follows that for every finite cardinal re, 

#(2 k )= 2 #m . 

x5.12. For all finite cardinals re, A, 

#(re A ) = #(re) #w) . 

x5.13. Show that if A is finite, then every surjection / : A — » A is an injection. 
(This is an alternative version of the Pigeonhole Principle.) 

x5.14. For all cardinals re, 2 k f c H 0 . ( Careful: we have not proved the 
Comparability Hypothesis 3.1, and so you cannot appeal to it.) 

x5.15. c + c = c H 0 • c = c c • c = c c. 

x5.16. c c = c 2 C . 

x5.17. For every cardinal number re > c 1, if re ■ re = c re, then 2 K = c re K . 
x5.18. For each cardinal number re and each re G N, 

re" = c re^ 0 ’"^ , 

where the left side is defined by 5.31 and the right side is cardinal exponentia- 
tion. 

x5.19. For each n ^ 0. c" = c |c*| = c c. 

x5.20 (Simultaneous Recursion Theorem). For any two sets E\, £Y elements 
a\ e E x . ci 2 £ E 2 and functions h\ : E\ x E 2 -> E\, h 2 : E\ x E 2 — > E 2 , there 
exist unique functions 

fi:N^E u f 2 :N^E 2 

which satisfy the identities 

/i(0) = a\, / 2 ( 0) = a 2 , 

/i(« + l) = hi(fi(n),f 2 (n)), fiin + 1) = h 2 (fi(n),f 2 (n)). 

*x5.21 (Nested Recursion Theorem). Prove that for any three functions 
i:£xNxf-*£, and 7r : N x F — > Z 
there is exactly one function /:Nx f-i£ which satisfies the identities 

/ (0. y) = g(y), f(Sn.y) = h(f(n,n(n,y)),n,y). 

Hint. Define recursively a function 4 > '■ N — > (N x (N — > Y)) such that the 
required function is f{n,y ) = Second(</>(«))(j). 

x5.22. Give a recursive definition of the factorial 

f (re) = 1 • 2 • • • (re — 1) • re (with /(0) = 1. conventionally). 
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x5.23. Give an explicit definition of the function / : A* — » A* on the strings 
from some set A which is defined by the string recursion 

/«» = (>. f{u*{x)) = {x)*f{u). 

x5.24. The function p in the proof of 5.32 is a bijection. 

*x5.25. Every partial ordering < on a finite set P has a linearization, i.e., some 
linear ordering <' of P exists such that x < y => x <’ y. 

*x5.26. The marriage problem. Suppose B is a finite set and h : B — > V(G) is 
a function, such that for each x G B, h(x) is a finite subset of G and 

X C B =>■ #(X) < #(U {h(x) I X G X}), (5-24) 

so in particular each h(x) ± 0. Prove that there exists an injection / : B >-* G 
such that 

(Vx G B)[f{x) G h{x)]. (5-25) 

Show also that the hypothesis (5-24) is necessary for the existence of some 
injection / which satisfies (5-25). Hint. Take cases, on whether or not there 
exists some 0 7 ^ C C B such that #(C) = #((J {h(x) \ x G C}). The name 
of the problem comes from the traditional interpretation, in which B is a set 
of boys, G is a set of available girls and h assigns to each boy x the (finite) 
set h(x) of girls that he would be willing to marry. There are many other 
applications of the problem, more useful and less sexist, for example when B 
is a set of courses, G is a set of professors and h assigns to each course the set 
of professors who can teach it (“the scheduling problem”). 

The next problem justifies one more form of recursive definition which is 
often useful in applications. 

x5.27. Complete Recursion. For each h : E* — > E, there exists exactly one 
function / : N — > E which satisfies the identity 

/(«) = h{f{n))\ 

similarly, for each h : i?*xN — > £\ there exists exactly one function / : N — > E 
which satisfies the identity 

f(n) = h(f («), n). 

The next problem gives a characterization of countable, infinite sets directly 
in terms of the membership relation, with no appeal to the defined notions of 
N and “function”. 

x5.28. Prove the equivalence: 

A = c N •£=>■ (3£?)[A = &0 £ S’ & (V« G 8?)(3!y ^ u)[u U {y} G W] 

&(VZ)[[0 GZ& (Vm G Z)(3!j $ u)[u U {y} G Znf]]=»i’ C Z]]. 
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The primary significance of the Recursion Theorem 5.6 is foundational, since 
the result justifies on the basis of the axioms a method of definition of functions 
which is intuitively obvious. From a purely mathematical point of view, 
however, we can also view 5.6 as a theorem of existence and uniqueness of 
solutions for systems of identities of the form 

/( 0 ) = a, 

f{Sx) = h{f (x)) (x € N), 

where a € E and h : E —> E are given and the function / : I I — > /: is 
the unknown. In this chapter we will prove an elegant generalization of the 
Recursion Theorem in the context of the theory of partial orderings, which 
implies the existence and uniqueness of solutions for systems of functional 
identities much more general than (6-1). The Continuous Least Fixed 
Point Theorem 6.21 is fundamental for the theory of computation , it is the 
basic mathematical tool of the so-called fixpoint theory of programs. In the 
next chapter we will show that it is a special case of a much deeper Fixed Point 
Theorem of Zermelo, which is intimately related to the theory of wellorderings 
and rich in set theoretic consequences: for example it implies directly the 
Hypothesis of Cardinal Comparability, 3.1. Thus, in addition to its purely 
mathematical significance, the Continuous Least Fixed Point Theorem yields 
also an interesting point of contact between classical set theory and today’s 
theoretical computer science. 

In their simplest and most natural expressions, the Fixed Point Theorems 
are somewhat abstract and apparently unrelated to the solution of systems 
of functional identities to which we intend to apply them. To understand 
what they say and how to use them, we will need to introduce first some basic 
notions from the theory of partial orderings. 

6.1. A partially ordered set or simply poset is a structured set 

P = (Field(P),< P ), 

where Field f C ) is an arbitrary set and < P is a partial ordering on Field) P), 
i.e., a reflexive, transitive and antisymmetric binary relation. Notice that < P 
determines P because it is reflexive, 

x € Field(P) •<=>• x < P x, 
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Figure 6.1. A discrete and a flat poset. 

so we can specify a poset P completely by defining its partial ordering < P . In 
practice, however, the partial ordering < P is often clear from the context and 
we will tend to identify a poset P with its field Field(F), following the general 
convention about structured sets discussed in 4.30. For example, by the poset 
N we obviously mean the pair (N. <), where < is the usual ordering on the set 
of natural numbers. By this convention, the points of P are the members of 
Field(F), a subset I C P is a subset / C Field(F), etc. Each / C P is a poset 
in its own right, partially ordered by the restriction of < P to I, 

x </ y *^=>-df x < p y & x € / & y € /, (6-2) 

which is (easily) a partial ordering. We will often deal with posets which have 
a least element, and it will be convenient to use the same, standard symbol 
-L (read “bottom” or “least”) for it, just as we use the same symbol 0 for the 
additive unit of every number system: 

_L = _L/> = df the least element of P (if it exists). (6-3) 

Any set A can be viewed as a discrete poset in which no two elements are 
comparable, i.e., partially ordered by the identity relation 

x < y x = y (v, y € A). 

Just above these in complexity are the flat posets which have a least element, 
the only element involved in any comparisons: i.e., 

x <p y x = 1 V x = y. 

The simplest non-empty poset is a singleton { J_}, which is both discrete and 
flat. Additional examples of posets are the sets of natural numbers N, rationals 
Q and reals R, with their usual orderings. These are all linear (totally ordered) 
posets. There is a large variety of finite posets and their study constitutes an 
important research area of mathematics, but we will not be much concerned 
with it here. Mostly we will use them as counterexamples. In drawing posets 
we indicate x < y by placing y above or to the right of x and drawing a polygonal 
line (sometimes directed, to avoid ambiguity) from x to y, which may pass 
through other points, e.g., c < e in Figure 6.2. 13 

6.2. Definition. Let P be a poset, S C P and M £ P a member of P. 


13 There are those who draw posets growing to the right, those who draw them growing up and 
even those who draw them growing down: to the best of my knowledge, nobody pictures posets 
growing to the left and it does not appear that any of the three common choices is dominant. 
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Figure 6.2. A finite poset 

1 . M is an upper bound of S if it is greater than or equal to every element 
of S, i.e., if (Vx G S ) [x < M], 

2. M is maximum in S if it is a member and an upper bound of S, i.e., if 
M G S & (Vx G .S' ) [x < M], 

3. M is a least upper bound of S if it is an upper bound and also less than 
or equal to every other upper bound of S, i.e.. if 

(Vx G S)[x < M] & (VM')[(Vx G S)[x < M'\ => M < M'\. 

If M \ , M 2 are both least upper bounds of S, then M\ < M 2 (because M 2 is 
an upper bound and M\ is a least upper bound) and symmetrically M 2 < M\, 
so that Mi = M 2 , i.e., S has at most one least upper bound. When it exists, 
the least upper bound of a set S is denoted by 

sup S = the least upper bound of S. (6-4) 

The term “sup” from the Latin supremum (maximum) is justified by the 
following observation. 

6.3. Exercise. If M is maximum of a set S in a poset P, then M is also the least 
upper hound of S. 

6.4. Exercise. In the poset of Figure 6.2, find subsets S with the following 
properties : (1)5 has no upper bound. (2) 5 has upper bounds but no least upper 
bound. (3) 5 has a least upper bound but no maximum element. 

6.5. Exercise. In any poset P. an element M is the least upper bound of the 
empty set 0 if and only if M is the least element of P. i.e., 

_L = sup 0 (6-5) 

if _L or sup 0 exists. 

6.6. Exercise. The powerset V(A) of every set A is partially ordered by the 
relation 

X C A Y df X C Y C A. 

so that _L = 0 and for everv S C V(A), the union (J 5 is the least upper bound 
ofS , 14 


14 The partial ordering of P(A) is the restriction Cj of the definite condition X C Y to P(A), 
and we often skip ihe subscript in referring to it, cf. 4.8. 
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E 


Figure 6.3. A partial function / : A — - E. 

Less trivial and more interesting for our purposes is the next example of a 
poset. 

6.7. Definition. A partial function on a set A to a set E is any function with 
domain of definition some subset of A and values in is, in symbols 

/ : A — >• E 4=4>df Function(/) &Domain(/) C ^4& Image)/) C E. (6-6) 

For example, (n i— > ( n - 1)) is a partial function on the natural numbers 
defined only when n / 0, (x i— > /x) is a partial function on the reals with 
domain of definition [x \ x > 0}, etc. A finite sequence u G A* is a 
partial function u : M — A. as we defined it in 5.30. The empty set 0 is 
(trivially) a partial function (with empty domain of definition!) and every 
(total) function on A to E is also a partial function, since (6-6) does not 
exclude Domain)/) = A, 

ti-A^E, f : A —> E => f : A ^ E. 

We will use systematically the convenient half-arrow notation for partial func- 
tions (recently established in computer science) , as well as the common nota- 
tions 

f{x) l 4=>df x G Domain)/), f{x) ] x £ Domain(/) (6-7) 

to indicate that a partial function is defined or undefined at some point; we 
sometimes read / (x) j as / (x) converges or / converges at x, and / (x) f as 
/ (x) diverges or / diverges at x. 

6.8. Definition. For each A and E , 

(A - E) = df {/ C A x E | / : A - E } (6-8) 

is the set of all partial functions from A to E, in analogy with the notation 
(A — > E) for the set of all (total) functions from A to E, cf. 4.14. The set 
(A E) is partially ordered by the relation C, 

f Qg (Vx G A)[f(x) i =>[g(x) | &/(x) = g(x)]]. 



with least element _L = 0. 
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6.9. Exercise. For each A, E, 

(A E) = {f \X \ f : A —> E &X C A}. 

Function restrictions are defined in (4-13). 

It is harder to find least upper bounds for sets in these partial function posets 
than in powersets: if for example, S C (N — *■ N) contains the two (total) 
constant functions x >-> 0 and rnl, then any upper bound / : N — N of 
S would satisfy both /(0) = 0 and / (0) = 1, which is absurd. On the other 
hand, linearly ordered subsets of (A E) have least upper bounds, and this 
is a fruitful property of these posets, worth a name. 

6.10. Definition. A chain in a poset P is any linearly ordered subset S of P, 
i.e., a set satisfying 

(Vx, y G S)[x < y V y < x], 

A poset P is chain-complete or inductive if every chain in P has a least upper 
bound. 

6.11. Exercise. The empty set is ( trivially ) a chain, hence every inductive poset 
P has a least element _L = sup 0. 

6.12. Exercise. Every flat poset is inductive ; a discrete poset is inductive only 
when it has exactly one element, in which case it is also flat. 

6.13. Exercise. The image {x n \ n G N} of a non-decreasing sequence 

Xo < Xi < X '2 < 

is a chain: thus, every non-decreasing sequence has a limit in an inductive poset, 
lim„x„ =df sup{x„ | n G N}. (6-9) 

6.14. Proposition. (1) For each set A, the powerset V(A) is inductive. 

(2) For any two sets A, E, the poset (A E) of cdl partial functions from A 
to E is inductive. 

(3) For every poset P, the set 

Chains(F’) = {S C P \ S is a chain} 
of cdl chains in P (partially ordered under C) is inductive. 

Proof. (1) is immediate by Exercise 6.6. (2) If S C (A — >■ E) is a chain, 
then the union (J S is a partial function and obviously, IJ S = sup S. (3) This 
is also proved by observing that the union of a chain of chains in a poset is 
also a chain. H 

6.15. Exercise. Neither N (with the usual ordering ) nor the finite poset of Fig- 
ure 6.2 are inductive. 

6.16. Exercise. For each set E, the set P = E* U (N — > E) of finite and infinite 
sequences from E is an inductive poset, under C. 

6.17. Exercise. For any two sets A, E, the poset 

(A y-+E) = df {/ G (A — ^ ■ E) | / is one-to-one} 
of partial injections from A to E (partially ordered by C) is inductive. 
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We will find the most significant applications of the fixed point theorems in 
partial function posets; but the proofs will use only the fact that these posets 
are inductive, and there are lots of other interesting examples. Some of them 
are described in the problems at the end of the chapter. 

Finally, we need to delineate the type of functions on inductive posets which 
must, necessarily, have fixed points. 

6.18. Definition. A mapping 15 n : P — > Q on a poset P to another is monotone 
if for all x, y G P, 

x <p y =>n(x) <q n{y). 

A monotone mapping need not be strictly increasing in the sense of 
x <p y => n{x) <q n{y), 
e.g., every constant mapping is monotone. 

Notice that if n : P — > Q is monotone and S C P is a chain, then the image 
?r[S] is also a chain; because given x = n{u), y = n(v) with u. v G S, either 
u < v , which implies x = n(u) < n(v) = y, or v < u, which similarly implies 
y < x. This makes the next definition meaningful. 

6.19. Definition. A monotone mapping n : P — > Q on an inductive poset 
to another is countably continuous if for every non-empty, countable chain 
S CP, 

^(supS) = sup7i[S]. 

6.20. Exercise. A monotone mapping n : P — > Q on one inductive poset to 
another is countably continuous if and only if for every non-decreasing sequence 
xo <p X] <p ... of elements in P. 

7t(lim„ x n ) = lim„ n(x„). 

Flere the limit on the left is taken in P and the limit on the right is taken in Q. 

6.21. Continuous Least Fixed Point Theorem. Every countably continuous, 
monotone mapping n : P — > P on an inductive poset into itself has exactly one 
strongly least fixed point x*, which is characterized by the two properties 

n{x*) = x * , (6-10) 

(Vy G P)[n(y) < y=^x* < y]. ( 6 - 11 ) 

Proof. The orbit of the least element J_ under n is defined by the simple 
recursion on the natural numbers. 

x 0 = -L. 

Xn+l 7l(x n ) . 


15 It is convenient to refer to ji : P — » Q as a “mapping” rather than a “function” (which 
means the same thing), because in the interesting applications P is some partial function space 
(A — *■ E). Q may be another partial function space, n takes partial functions as arguments 
and (possibly) values and there are altogether too many functions around. Notice also that, 
pedantically, n : Field(P) — » Field(£)) is a mapping from the field of P to that of Q. 
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Figure 6.4. The Continuous Least Fixed Point Theorem. 

Clearly x 0 = X < xi, and by a trivial induction (using the monotonicity of 
n), for every n, x n < x n+ \. Thus, the limit 

x* =df lim„ x n = sup {x„ | n € N} (6-12) 

exists by 6.13, and by the countable continuity of n, 

n(x*) = 7t(lim„ x n ) = lim„ n(x„) = lim„ x n+ \ = x* . 

For the second claimed property of x*, we assume n(y) < y and show by 
induction that for every n, x n < y. Basis. .v 0 = i < y. trivially. Induction 
Step. The Induction Hypothesis gives us x n < y, and we compute: 

x n < y => n{x„) < n(y), (because n is monotone), 

=> x n+ \ < 7t(y) < y, (by the assumption on y). 

Thus, y is an upper bound of the set of values \x n \ n e N} of the sequence, 
and hence, x* = sup{x„ « e N} < y. H 

To apply the Continuous Least Fixed Point Theorem, we must formulate 
the problem at hand as a question of existence and (sometimes) uniqueness 
of solutions for an equation of the form n(x) = x, where n : P — > P is 
monotone and countably continuous on some inductive poset P. This is 
typically the hardest part: to bring the problem in a form in which 6.21 can be 
applied. Verification of the countable continuity of n is not necessary : because 
we will show in the next chapter that 6.21 remains true if we simply remove the 
hypothesis of countable continuity of n. In any case, most applications involve 
simple monotone mappings on partial function posets for which it is often 
trivial to recognize a much stronger, natural continuity property. 

6.22. Definition. A partial function g : A E is finite if it has finite domain, 
i.e., if it is a finite set of ordered pairs. 

A mapping n : (A — - E ) — > {B — >■ M) from one partial function space into 
another is continuous, if it is monotone and compact, i.e., for each / : A — >■ E, 
and all y € B and v € M, 

n{f){y) = u=>(3/ 0 C f)[f 0 is finite &n(f Q ){y) = v]. (6-13) 
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The notation is a bit convoluted, but what it means is quite simple: to 
compute n{f){y), we first compute the partial function f = n(f) and then 
we evaluate it at y, n{f){y) = f'(y): a monotone mapping n is continuous 
if each value n(f)(y) of n{f), whenever defined, depends only on finitely 
many values of /. In fact, we can combine the conditions of monotonic- 
ity and compactness in the following, simple characterization of continuity 
for partial function space mappings which often allows the immediate — “by 
inspection” — recognition that they are continuous. 

6.23. Proposition. A mapping n : (A — *• E) —> (B — - M) is continuous if and 
only if it satisfies both (6-13) and its converse , i.e., if for every f : A — ^ E and 
cdl y G B and v £ M: 

n{f){y) = v (3/ 0 C f)[f 0 is finite &n(f 0 )(y) = v\. (6-13*) 

For example, the mapping n : (N — N) — > (N — > N) defined by 

n(f) = (« ^ /(«) +/(» 2 )) 

is continuous, because (obviously) for every / and n. 

n{f){n) = n(f 0 )(n), 

where /o is the restriction of / to the two-element set \n. n 1 }. 

Proof of 6.23. Suppose first that n is continuous, / : A — *• E, y e B, and 
v € M. If n(f)(y) = b, then by the compactness of n, there exists some finite 
/o C / such that n(fo)(y) — v; and if such a finite foQf exists, then the 
monotonicity of n implies that n{f){y) = v. 

For the converse, assume that (6-13*) holds for every / : A E and 
all y e B,v e M. This gives immediately the compactness of n. To verify 
that n is monotone, suppose / C g and n{f){y) = v; by the => direction 
of (6-13*). there is some finite foQf such that n(fo)(y) = v; but now 
/o C g, and so by the ■<= direction of (6-13*), n{g)(y) = v. H 

6.24. Exercise. Show that the mapping n : (N — ^ N) — > (N — ^ N) defined by 

7t(/) = (n <->■ 2” =0 /(i)) 
is continuous. Compute n(n i— > 2n)(2) for this n. 

6.25. Definition. A function / : X — > Y from one topological space to an- 
other is (topologically) continuous, if the inverse image f~ l [G ] of every open 
subset of Y is an open subset of X. The definition of a topological space was 
sneaked in 4.30, as the first example of a structured set. 

6.26. Exercise. A function f : X —> Y from one topological space to another 
is continuous if and only if the inverse image f~ l [F] of every closed subset of Y 
is closed in X. 

One might guess that our using the term “continuous” in Definition 6.22 is 
not entirely accidental and that the notion of 6.22 has something to do with 
topological continuity. Indeed it does: the notions are equivalent when the 
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proper topology is put on partial function posets, but we will have no need of 
this fact and will leave it for Problem x6.21. 

6.27. About topology. General (pointset) topology is to set theory like parsley 
to Greek food: some of it gets in almost every dish, but there are no great 
“parsley recipes” that the good Greek cook needs to know. Many notions 
and results of set theory are connected to topological ideas, but it is quite rare 
that you can prove an interesting theorem about sets by quoting some deep 
topological result. To avoid getting distracted with side issues, we will follow 
the general policy of giving the most direct, set theoretically natural definitions 
and proofs of the notions and results we need and leave the connections 
with topology for the problems. Occasionally the most natural approach is 
topological. 

6.28. Lemma. IfS C (A — >• E) is anon-empty chain in a partial function poset 
and fo C sup S is a finite function, then there exists some g G S such that 
fo C g. 


Proof is by induction on the number of elements in the domain of fo. 
Basis, fo = 0 is the partial function which is nowhere defined. There is 
some g G S since S is non-empty, and 0 C g. 

Induction Step. The domain of / 0 has n + 1 elements, so 

/o = /iU {(*,«’)} C sup A, 

where f \ is a finite, partial function with just n elements in its domain, and 
by induction hypothesis, there exists some g\ G S such that f \ C g,. Since 
(x, w ) G sup S', there must also exist some h' G S such that (a, w) € h' , and 
since S is a chain, either gi C or h' C gp the g we need is the larger of 
these two partial functions. H 

6.29. Lemma. Every continuous mapping tz : (A — >• E) — > (B — >■ M) is count- 
ably continuous, in fact, for every (not necessarily countable) non-empty chain 
S C (A — E), 

7i (sup S ) = sup7r[S]. 

Proof. Supposed C (A — ■* E) is a non-empty chain with union / = supS. 
If g G S, then g < /, and since n is monotone we have 7i(g) < n{f), so that 

sup7i[S] = sup {71(g) I g G S} < n(f). 

For the converse inequality, we need to show that if n{f){y) = v, then there 
exists some g G S such that n{g)(y) = v. By the continuity of n, there exists a 
finite fo C /, such that already 7t(/o)(y) = v, by the preceding Lemma, there 
exists some g G S so that fo C g; and by the monotonicity of n, this implies 
7t (/ 0 ) C 7t(g). In particular, since n{f o)(y) = v, we have n{g){y) = v, so 
this is the g we need. H 

The Continuous Least Fixed Point Theorem is evidently a simple corollary 
of the Recursion Theorem on the natural numbers 5.6. In fact, it implies 5.6 
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by a fairly direct argument, which is worth looking at, as it illustrates how we 
intend to apply 6.21. 


6.30. Proof of the Recursion Theorem from 6.21. For each given a € E and 
function h : E —> E, we define the mapping 

n : (N — E) — > (N — E) 


by the formula 


n{f) = /', where /'(x) = 


a, if x = 0. 

h[f[x — 1)), if x > 0, 


where / is any partial function from N to E and we understand the definition 
naturally, so that 

x > o => [/'(x) | <*=>• h(f(x - 1)) | <*=>■ f{x - 1) |], 


Written out in detail, the mapping n associates a set of pairs f C (N x E) 
with every / e (N — *■ £) and it is defined by the equation 


n{f) = {(0. a)} 

U {(x, h(w)) | x > 0& (x — 1, w) € fj if : N — *■ E). (6-14) 

From this we get that for every / and x, 

n{f)ix) = 7i(/o)(x), 

where / 0 = {(0,a)} if x = 0 and f 0 = {(x - 1 ,/(x — 1))} if x > 0, so that 
7 z is continuous by Proposition 6.23, and hence countably continuous. Thus 
by 6.21, it has a fixed point: that is, some partial function /* : N — >• E exists 
which satisfies f* = n{f*), so that, immediately, 

/*(0) = a, (6-15) 

f*(x + 1) = h{f*{x)) (/*(x) |). (6-16) 

Theorem 6.21 does not guarantee that this f* is a total function, with domain 
of definition the entire N, but this can be verified by an easy induction on x 
using the identities (6-15) and (6-16). H 


Consider next a case where it is not quite so obvious how to define the 
function we want directly by the Recursion Theorem. 

6.31. Proposition. For each function h : N — » N and each infinite set A C N of 
natural numbers, there exists a [total) function f : N — * N which satisfies the 
identity 


fin) 


0, if n € A, 

h[f [n + 1)), if n £ A. 


(6-17) 


Proof. We define the mapping 


7i : (N — N) -> (N — N) 
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on the inductive poset of all unary, partial functions on N by the formula 


n{f) = /', where /'(n) 


0, if n € A, 

h{f{n + 1)), if n f A. 


(6-18) 


In full detail, this means we set 


n{f) = {(«,0) | n € A} U {(n,h(w)) \ n ^ A & (n + 1, w) € /}. 

which implies by inspection that 7i is continuous, hence countably continuous. 
Thus we must have a fixed point / which satisfies (6-17), and it is enough to 
prove that this /is total. Assume towards a contradiction that /(«)t for some 
n. Notice that by (6-17), this means that n f A, else /(«) J., in fact, f{n) = 0. 
We will prove by induction on i that / ( n + i ) f, which implies again that for 
all i. n + i £ A, so that A C [0. n ) is finite, contradicting the hypothesis. 
Basis. If i = 0, then f(n + 0) = /(«) f. by assumption. Induction Step. 
Assume that /(« + i) f, so that by (6-17), once more, n + i ^ A. Now this 
implies that / (« + i) = h(f {n + i + 1)) so that / (n + i + 1) j / (n + 1) | 

(since h is total), which violates the induction hypothesis. H 

6.32. Exercise. Prove in detail that the mapping n in this proof is continuous. 
As a third, typical application of the Continuous Least Fixed Point Theorem 

we consider the Euclidean algorithm. 

6.33. Proposition. (1) There exists exactly one partial function f : N x N — - N 
with domain of definition {( n,m ) | n.m f 0} which satisfies the following 
identities for all 0 < n < m: 

fim.n ) = f(n.m), 

/(»,«) =n, (6-19) 

f(n,m) = f(n,m — n). 

(2) The unique f* which satisfies the system (6-19) computes the greatest 
common divisor of any two natural numbers different from 0, 

f*(n,m) = gcd (n.m) (6-20) 

= the largest k which divides evenly 
both natural numbers n. m. 


Proof. With each partial function / : N x N — *■ N we associate the partial 
function /' : N x N — *• N which is defined by the formula 

{ f(m,n), if n > m > 0, 

n, if n — m > 0. 

/ (, n , m — n) if 0 < n < m, 

and we set 

n{f)=f. 

The mapping n : ((N x N) — >• N) — > ((N x N) ^ N) is continuous, by 
inspection. It follows that there exists a least partial function f* : (NxN) N 
which satisfies 


n(f*)=r. 
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Figure 6.5 

and this is (easily) equivalent with the system (6- 19). Proof that for all //. m f 0 
f*{n,m) l & f*(n,m ) = gcd (n.m) 

is by induction on the sum n + m. (Take cases whether n > m > 0. n = m > 0 
or 0 < n < m. and use the simple property of the natural numbers, that for 
0 < n < m, the common divisors of n, m are precisely the same as the common 
divisors of n. m — n.) 3 

In this example we do not need the Least Fixed Point Theorem to prove the 
existence of a solution for the system (6-19), since we can verify directly that the 
function gcd is a solution. Despite this, the proposition is important because 
it yields a characterization of the function gcd which suggests a specific — and 
simple — method for computing it. For example, using only the identities of 
the system, we compute: 

gcd(231, 165) = gcd(165,231) = gcd(165,66) = gcd(66, 165) 

= gcd(66, 99) = gcd(66, 33) =gcd(33.66) 

= gcd(33, 33) = 33. 

This computation of the value gcd(231, 165) is much simpler than the triv- 
ial one, where we would search for the greatest common divisor by testing 
in sequence all the numbers from 165 moving down, until we would find 
some common divisor of 165 and 231. The example is quite general: the 
characterization of a partial function / as the least solution of a system of 
simple identities typically yields an algorithm, a “recipe” for the “mechani- 
cal” computation of the values of /, and this is the underlying reason for 
the significance of the Continuous Least Fixed Point Theorem in theoretical 
computer science. 

We end with a simple result about graphs which is related to the ideas of 
this chapter, see Problems x6.16 and x6.17. 

6.34. Definition. A graph is a structured set (G, — ><y), where the set of edges 
— C G x G is an arbitrary binary relation on the set of nodes G. The 
transitive closure of a graph G is the graph G = (G, =>g), where 

x =>(, y <=X- t ir there is a path from x to y in G 

<=> (3 z 0 , . . . ,z„)[x = zq zj & • • • &r„_! z n = y]. 
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We draw graphs much like posets, except we forget about the convention of 
“growing up or towards the right”: x — >g }’ holds if there is an arrow from x 
to y, and x => G y holds if you can move from x to y along the arrows of the 
diagram. In Figure 6.5 we have / — > /, a => a and a => c, but f fir d. 

6.35. Proposition. For each graph G , the transitive closure relation =>g satisfies 
the equivalence 

x =>g )’ x — >g y V (3 z G G)[x — >g s&z => g y], (6-21) 

Proof. Suppose first that (skipping the conjunction signs) 

X = Z 0 — Z 1 — >G z 2 > G • • • ¥ G 

if n = 1, we have x — >g y , and if « > 1. then x — > G z i and zi =>g )’ (by the 
definition of =»g)> so we have the right-hand sideof (6-21), takings = z\. The 
converse is equally simple, taking cases on the two disjuncts of the right-hand 
side. 3 


Problems for Chapter 6 

x6.1. For every partial ordering < on a set A, the converse relation 

x <' y -<==>df y < x 

is also a partial ordering. Of the inductive posets (A — - E) and V{A ), which 
one has an inductive, converse poset? 

x6.2. Suppose <e is an inductive partial ordering on the set E, A is a set and 
< is the “pointwise” partial ordering on the function space (A — > E), 

f <g <=>df (Vx G A)Lf(x) < E g(x)] {f,g : A-> E). 

Prove that < is an inductive partial ordering on (A —> E). 

x6.3. If the partial orderings < i , <2 on the respective sets Pi . /F are inductive, 
then the following relation < on the Cartesian product Pi x P 2 is also inductive: 

(xi,x 2 ) < (.Vi, yi) <=>df Xi <1 y 1 &x 2 <2 yi- 

With this partial ordering, the poset P\ x P 2 is called the product of the two 
posets Pi and P 2 . 

x6.4. Suppose P\. P 2 . Q are inductive posets. A mapping n : Pi x P 2 — > Q is 
separately monotone if for each at g P\. the mapping (x 2 i-> n{x\, x 2 )) on P 2 
is monotone, and symmetrically for each x 2 G P 2 . Prove that n is monotone 
(on the product poset) if and only if it is separately monotone, 

x6.5. Suppose Pi, P 2 , Q are inductive posets. A mapping n : P\ x P? — > Q 
is separately countably continuous, if for each xi G Pi, the mapping (x 2 i-> 
n{x\, x 2 )) on Pi is countably continuous, and symmetrically for each x 2 G P 2 . 
Prove that n is countably continuous if and only if it is separately countably 
continuous. 
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6.36. Definition. A point M is maximal in a subset S of a poset P if M is a 
member of S and no member of S is bigger, 

M £ S & (Vx € S)[M < x => M = x ]. 

A point m is minimal in S if it is a member of S and no member of S is smaller, 

m £ S & (Vx £ S)[x < m => x = i n], 

x6.6. Find in the poset of Figure 6.2 a subset S which has a maximal element 
but no maximum and another subset S' which has a minimal element but no 
minimum. 

*x6.7. Every finite, non-empty subset of an arbitrary poset P has at least one 
maximal and one minimal member. 

*x6.8. A finite poset P is inductive if and only if it has a least element. 

An important notion in computer science is that of a stream, for example 
the stream of bytes in a file transmitted over the telephone lines to my home 
computer from the University of Athens CYBER. A stream is basically a 
sequence, but it may be infinite , in the idealized case; terminated, if after some 
stage an end-of-file signal comes and my machine knows that the transmis- 
sion is done; or stalled, if after some stage the bytes stop coming, without 
warning, perhaps because the CYBER died or the telephone connection was 
interrupted. 16 

6.37. Definition. For each set A, we fix some t £ A (for example, the object 
r(A) of (3-4)) and we define the streams from A by: 

Streams (A) 

=df W : N — > A U {?} | (Vz < j)[a{j) | =M<r(z) I &a(i) fi ?]]}. 

We call a stream a terminated or convergent if for some n, a(n) = t, in which 
case, by the definition Domain(er) = [0. n + 1): infinite if Domain)^) = N; 
and stalled if Domain(cr) is a finite, initial segment of N but a does not take 
on the terminating value t. The infinite and stalled streams together are called 

divergent. 

x6.9. For each set A, the set of streams Streams)^) is an inductive poset under 
the natural, partial ordering C, where, as for strings, 

<x C r ^=^df o’ C r. (6-22) 

What are its maximal elements? 

x6.10. The concatenation cr*r of two streams is defined so that if cr is divergent, 
then a-kx = a and if a is convergent with domain [0. n + 1), then 

i < n ==> (<t*t)(z) = <t(z), (ct* x)(n + i) = r(z). 


16 Well. the CYBER has died completely since the first edition of these Notes, but the newer 
machines and more robust telephone lines still quit unexpectedly on some occasions! 
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Prove that * is a continuous function (of two variables) on Streams)^). 

The full Least Fixed Point Theorem can be proved directly and easily for 
powersets: 

*x6.11. Suppose n : V(A) — » V{A) is a monotone mapping on a powerset. 
Prove that the set 

A* = f]{X | n(X)C X} 
is the least fixed point of n, and 

A* = UR I X C n(X)} 
is the largest fixed point of n. 

The next few problems deal with “algorithmic” applications of the Least 
Fixed Point Theorem. 

x6.12. For each relation R C N x A. there exists a least partial function 
/ :Nxd-^N such that 

J R(n.x) => f{n,x)=n, 

\ ^R{n.x) ==> f(n,x)=f(n + l,x). 

It follows that 

f(n,x)i ■<=>■ (3 m>n)[R(m,x), 

f ( n , x) | => / («, x) = the least m > n such that R(m, x). 

x6.13. For any three partial functions / o, g, h with domains and ranges 
such that the identities below make sense, there exists a least partial function 
/ : N x A — - E which satisfies the identities 

/ (0, x) =/o(x), 

f(n + l,x) = h(f(n,g(n,x)),n,x). 

x6.14. Prove that there is exactly one total function / : N x N — > N which 
satisfies the identities 

f(0,n) = f(n, 0) =0. 
f(n + L«i + 1) = f(n, m) + 1. 

Compute /( 5,23) using these identities and “explain” what f(n,m) is, for 
any n. m. 

6.38. Definition. On the set E* of strings (finite sequences) from a set E 
defined in 5.30, we define the partial functions 

head(w) = u{ 0), (6-23) 

tail(«) = (u( 1), .... u{lh(u) — 1)). (6-24) 

Notice that head(w) J, when lh(«) > 0, while tail(w) is always defined, but it is 
the empty string when lh(«) < 1. 
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x6.15. Prove that there exists a unique, total function r : E* —> E* which 
satisfies the identity 

, J u, if lh(«) < 1, 

|r(tail(«)) * (head(w)), iflh(w) > 1. 

Compute r{(a, b , c)) and describe r(u) in general. 

x6.16. Prove that for each graph G, the relation => G is the least (under C) 
transitive relation on G which includes the edge relation — > G . 

x6.17. Prove that for every graph G with edge relation — > G , the relation => G 
is the common least fixed point of the following monotone operators on the 
poset V{G x G) of all binary relations on G: 

7 ii(R) = {(x,y) | x > G y V (3 z)[x > G z&(z,y) <E P]}. 

ni{R) = {( x,y ) | x -> G y V (3z)[(x,z) £ R & z — > G v]}, 

K3(R) = {(x,y) | x -» G y V (3z)[x > G z -> G y] 

v(3z, w)[x — +<5 z&(z,w) € R&w — > G y]}. 

x6.18. Let Pi, P 2 be inductive posets and 

ni : Pi x P 2 -> Pi, 

n 2 ■■ Pi x P 2 -» P 2 

arbitrary countably continuous, monotone mappings, where Pi x P 2 is the 
product. Prove that there exist unique least, mutual or simultaneous fixed 
points x*, x 2 which are characterized by the properties: 

n\ (xf , x|) = x* , 7 r 2 (xf , x|) = x| , 

n\{y\,yi) <i yi&^iyuyi) <2 yi^x\ <1 yi&x 2 < 2 y 2 . 

The next problem is an algorithmic version of the well-known number 
theoretic result, that for any two natural numbers n, m / 0, there exist (positive 
or negative) integers a, fl such that 

gcd(/r, m ) = an + (1m. 

The proof uses some simple properties of the set of rational integers 
%={... ,-3,-2, -1,0, 1.2,3,...}. 

*x6.19. There exists exactly one pair of partial functions 

a:NxN- Z, y?:NxN — Z, 

with common domain of definition {( n,m ) | n, m ^ 0} which satisfy the 
following identities for all n, m, k > 0: 

if n ^ in. then a(n.m) = fl(m.n), 
a(n,n) = 1, /?(«,«) = 0, 

a(n. n + k) = a(n, k) — /?(«, k), p{n, n + k) = y?(/i. k). 
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It follows that for any two natural numbers n, m f 0, 

gcd (n,m) = a(n,m)n + f(n,m)m. 
x6.20. Find integers a, [1 e Z such that 

33 = 231a + 1 65/?. 

Find also integers a\, ft G Z such that 

1 = 137a-i + 997ft. 

6.39. Definition. For each finite partial function g : A — E. the neighborhood 
determined by g in the poset (A — > E) is the set 

N(g) = df {/ : A - E \g C /} 

of all extensions of g. A set G C (A — *• E) is open in the topology of pointwise 
convergence if 

f e G=^(3g, finite) [/ e N(g) C G], 

x6.21. Prove that the family of open sets in (A — - E ) defined in 6.39 is a 
topology by 4.30, and a mapping 

n : (A — E) -> (5 — M) 

is continuous in this topology by 6.25 if and only if it is continuous by 6.22. 

6.40. Definition. A subset G C P of an inductive poset is Scott open if (1) it 
is upward closed, i.e., 

x € G & x < y => y G G, 

and (2) for every non-empty chain S C P, 

sup S e G => (3.x g S)[x g G]. 

*x6.22. Prove that the family of Scott open subsets of an inductive poset P is 
a topology. 

*x6.23. Suppose P and 0 are inductive posets and n : P — > 0 is a mapping. 
Prove that 7i is continuous in the relevant Scott topologies if and only if n is 
monotone and for every non-empty chain S C P, 

7l(sup S ) = SUp7l[S']. 

Hint. Some find it helpful for this problem to first prove and use the fact that 
for every c G P, the set {x e P \ x < c} is Scott closed. 

*x6.24. Suppose A is a countable set and n : (A — >■ E) — > (B M) is a 
mapping. Show that n is continuous (by the definition in 6.22) if and only if it 
is continuous with respect to the Scott topologies in the posets (A — <■ E) and 
(B — >• M). 

The Continuous Least Fixed Point Theorem is often formulated for the 
class of directed-complete posets, particularly in Computer Science texts. 
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6.41. Definition. A subset S C P of a poset P is directed if any two members 
of S have an upper bound in S, 

x , y € S => (3z g S)[x < z&y < z]. 

A poset P is directed-complete (a dcpo) if every directed .S' C P has a least 
upper bound. 

x6.25. Every chain in a poset is a directed set, hence, every dcpo is an inductive 
poset and the least fixed point theorems hold for directed-complete posets. 

x6.26. For each A and E, the posets (A — >• E) and (A >->E) are directed- 
complete. 

x6.27. A mapping 7 1 : {A ^ E) (B M) is continuous if and only if for 
each directed S C (A — - E), 

77 (sup S ) = SUp7l[5']. 

x6.28. The product P\ x P 2 (Problem x6.3) of two directed-complete partial 
orderings is also directed-complete. 

*x6.29. Every countable, inductive poset is directed-complete. 

Actually the notions of inductive and directed-complete are equivalent; for 
a monotone mapping 71 ; P -4 Q on one inductive poset to another, the 
equation 

7r(sup5') = sup 7r[.S] (6-25) 

holds for all non-empty chains S C P if and only if it holds for all non-empty 
directed sets S C P: and the characterization of Scott continuity in x6.24 
holds whether A is countable or not. The proofs of these results are not 
elementary and require the Axiom of Choice, see Problems x9.22 - x9.25. 
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7.1. A well ordered set 17 is a poset 

U = (Field(C), <[/), 

where < v is a wellordering on Field! U), i.e., a linear (total) ordering on 
Field) U) such that every non-empty X C Field! U) has a least member. 
Associated with U is its strict ordering < v , 

x <u }’ -£=><if x <u }’ & x ± y. 

For example, the set N of natural numbers is well ordered by its natural 
ordering, and so is each of its finite initial segments 

[0, n) = {i e N | / < «}. 

A “longer” well ordered set is (N U {oo}. <'), where oo is some object not in 
N which we put after all the natural numbers, i.e., 

x <' y •<=>■ y = oo V [x, y € N & x < j]. 

As we did with arbitrary posets in the preceding chapter, we will usually 
identify U with its field, talk about the points or subsets of U, meaning the 
members and subsets of Field) U), skip the subscript in < (/ or < ( / when it is 
obvious from the context, etc. 

The most basic results of the last two chapters were all proved by some 
combination of the coupling 

definition by recursion - proof by induction. (7-1) 

In the simplest case, some function / : N — > E is defined by recursion, some 
properties of / are proved by induction and these in turn imply the theorem 
we want. Typical are the Continuous Least Fixed Point and the Schroder- 
Bernstein theorems which say nothing (explicitly) about recursion, induction 
or any functions with domain N, but whose proofs most assuredly use precisely 
these notions. We based the proof of the Recursion Theorem 5.6 directly on 
the Induction Axiom for the natural numbers. The key fact, however, which 


17 Do we dare call them wosetsl It’s not much worse than posets and it would sure save a lot of 
key strokes. 
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Figure 7.1. The low end of a long wellordering. 

can be generalized is that N is well ordered by its natural ordering. Here 
we will generalize 5.6 to a powerful Transfinite Recursion Theorem 7.24 
which justifies definition by recursion of functions / : U — > E on any well 
ordered set U . Coupled with Hartogs’ Theorem 7.34 which guarantees the 
existence of “arbitrarily large” well ordered sets, this makes it possible to apply 
the basic idea of (7-1) in situations far removed from the natural numbers. 
Typical applications are the Fixed Point Theorem 7.35 and its corollary, the 
Least Fixed Point Theorem 7.36, which is just 6.21 without the countable 
continuity hypothesis. 

7.2. A set A is well orderable if it admits a wellordering, so it is the field of 
some well ordered set (A,<). One of the chief lessons of this chapter is that 
well orderable sets behave much better than arbitrary sets: for example any 
two of them are comparable in cardinality, either A < c B or B < c A. In 
fact, every set is well orderable. Zermelo showed this in 1904, settling with 
one brilliant stroke the problem of Cardinal Comparability and a whole slew 
of related, regularity questions about arbitrary sets. We will prove Zermelo’s 
Wellordering Theorem in the next chapter, after we introduce the Axiom of 
Choice on which it is based. It is worth pointing out here, however, that 
the mathematical content of this fundamental result is just the sum of the 
Transfinite Recursion and Hartogs’ Theorems: the Axiom of Choice simply 
allows us to put the two together. 

7.3. Exercise. If C is well orderable and A < c C , then A is well orderable. 

7.4. Exercise. If C is well orderable and there exists a surjection f : C — » A, 
then A < c C , and hence A is also well orderable. 

7.5. Successor and limit points. Every well ordered set U looks at its low end 
like an initial segment of N. If it is not empty, it must have a least member 
which is typically denoted by 0 rather than _L, 

0 = Ot/ =df the least element of U. (7-2) 

It is pictured by a hollow square, the first point in Figure 7.1. Each a € U other 
than the maximum (which may or may not exist) has an element following it 
immediately, 

5(a) = Su{ a) = df min v {y e U \ x < y}. (7-3) 

The values of the partial function S : U — > U are the successor points of U. 
In addition, U may have limit points which are above 0 but not the successor 
of anything: 

Limit[/(x) 0 < a & (Mu < a) (3 v)[u <v< a], (7-4) 
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These are pictured by black boxes in Figure 7.1. The first limit point of U is 
typically denoted by 

co = cojj =df min{x G U | Limit (x)}, (7-5) 

when it exists, the points below it are the finite points and the points above it 
(including co) are the infinite points of U. If U is infinite, then the function 
n : N >— > U defined by the recursion 

7c(0) = 0[/ = the least member of U, ,, 

n(n + 1) = Su(n(rt)), 

is an order-preserving correspondence of N with the finite points of U. 

7.6. Exercise. For each subset I C U of a well ordered set U, the restriction 

x <i y x <u y & x, y G I 

of <u to I is a wellordering , so that I is a well ordered set in its own right with 
this ordering. 

7.7. Definition. A well ordered set U is an initial segment of V if Field ( C ) 
is a downward closed subset of Field) F) and < v is the restriction of < v to 
Field) U): 

UO V Field (U) C Field(F) (7-7) 

& (Vj G Field (U))(Vx < v y)[x G Field(C)] 

&(Vx , y G Field({7))[x <y y x <y y]. 

Clearly V is an initial segment of itself, the trivial one. With each r G V we 
associate the proper initial segment of points strictly below y, 

seg(j) = seg F (j) = d f {x G V \ x < v y} ^ V (7-8) 

More precisely, this is the field of seg( v), but the ordering is determined by V 
and we will talk about initial segments as if they were just sets, as usual. 

7.8. Exercise. If 0 is the least element of U, then seg(O) = 0, and if x G U has 
a successor, then 

seg(S(x)) = seg(x) U {x}. 

7.9. Proposition. A set I is an initial segment of a well ordered set U if and only 
if I = U or for some x G U, I = seg(x). 

Proof. If / ^ U, let x = min( U \ /) so that immediately, 

y G seg(x) => y < x => y G /, 

and to verify that / = seg(x), it is enough to prove 

y G / ==> y < x. 

Towards a contradiction, if y G / but y f x, then we must have x < y, which 
implies x G / because / is downwards closed, contradicting the choice of x. 
The converse is trivial. H 
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7.10. Exercise. The family of initial segments of a well ordered set U is well 
ordered by the relation C. 

The general idea is to view a well ordered set U as a generalization of the 
natural number sequence 0, 1 , 2 , . . . , possibly shorter than or of equal length 
to N, typically much longer. The particular members of U will be of little 
consequence: it is the length of the sequence in which we will be interested. We 
introduce here the general notion of isomorphism which relates posets with 
the same shape, the shape of a well ordered set being just a “length”. 

7.11. Definition. A function n : P — > Q from one poset into another is order- 
preserving if for all x, y G P, 

x <p y n(x) <q n(y)\ 

a similarity is an order-preserving bijection n : P >— » Q, and if one such exists 
we call P and Q similar, order isomorphic or copies of each other. We write 

P =o Q (371 : P >-» Q)[n is a similarity]. 

The subscript o in = 0 stands for “order type”, a fancier expression for “shape”. 
Notice that by our general convention of talking about a poset as if it were 
its field, we write n : P >— » Q for similarities instead of the more explicit 
7i : Field(P) >-* Field(0). 

7.12. Exercise. Every order-preserving n : P — > Q from one poset to another is 
monotone', but there exist monotone mappings which are not order-preserving. 

7.13. Exercise. If P and Q are linear posets, then a function f : P — Q is 
order-preserving if and only if it is strictly monotone, i.e.. 

x < P y=>f( x ) < Q f(y). 

In particular, order-preserving functions on well ordered sets are strictly mono- 
tone, and hence one-to-one. 

7.14. Exercise. For all posets P, Q, R, 

P=„ P 

P=o Q=>Q=oP, 

P= 0 Q &■ Q =o R => P = 0 R. 

7.15. Lemma. If a poset P is similar to a well ordered set U , then it is also well 
ordered. 

Proof. Given 0 f X C P, let p e V be the < (/-least element of the image 
7 z[X] and verify (easily) that x = n~ l (p) is <p-least in X, because n preserves 
the orderings. H 

We can construct explicitly some fairly long wellorderings by starting with 
N and its finite initial segments and applying repeatedly several natural op- 
erations on posets which yield well ordered sets on well ordered arguments. 
Here we look at just one of these, leaving the rest for the problems. 
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Figure 7.2. The successor poset to P and Succ(N). 

7.16. The successor Succ (P) of a poset P is obtained by adding a new point 
above all the members of P. To be specific, we can choose to add to the field 
of P the object 

tp =df r(Field(P)) (7-9) 

which is guaranteed by 3.11 to be a new element, and we set 

* — Succ(p) y ^=>df X <p y V [x g P&y = tp\V x = y = t P . (7-10) 

If P is finite with n elements, then Succ( P) has n + 1 elements, in fact, easily 
Succ([0, n)) = 0 [0, (n + 1)). On the other hand, Succ(N) is countably infinite, 
but with a different, “longer” ordering than N, as it has a maximum element 
which comes after all the natural numbers. 

7.17. Exercise. If P = 0 Q, then Succ(F) =„ Succ(£>). 

7.18. Exercise. If U is well ordered, so is Succ((7). 

Using this successor operation on posets, we can view each well ordered set 
U as a proper initial segment of another, 

U = seg succwitu) ^Succ(U). (7-11) 

7.19. Definition. A mapping n : P — > P on a poset to itself is expansive, if for 

all x € P, x < 7i(x). 

7.20. Theorem. Every order-preserving injection n : U >—> U of a well ordered 
set into itself is expansive. 

Proof. Towards a contradiction, assume that n : U >-* TJ is order-preserving 
but that for some x g U, n(x) < x, and let 

x* = min{x g U \ n(x) < x}. 

Thus, n(x*) < x* , and so n(n{x*)) < n(x*) since n is an order-preserving 
injection, which contradicts the choice of x*. H 

7.21. Corollary. No well ordered set is similar with one of its proper initial 
segments, and hence no two distinct initial segments of a well ordered set are 
similar. 

Proof. Every similarity n : U >-» seg(x) is (in particular) an order-pre- 
serving injection of U into U , so we cannot have n(x) < x, by the theorem.H 
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Because a well ordered set may have limit points in addition to its 0 and its 
successor points, it is easiest to generalize the principles of proof by complete 
induction and definition by complete recursion. 

7.22. Transfinite Induction Theorem. For every well ordered set U and every 
unary definite condition P. 

if (Vy G t/)[(Vx < y)P(x) => P(y)] then (Vy G U)P{y). 

Proof. Assuming the opposite, towards a contradiction, let 
y* =df min{y G U j (Vx < y)P(x) &^P(y)}; 
the hypothesis yields P(y*), which contradicts the choice of y*. H 

In specific cases, it is often just as easy to prove (Vy G U)P(y) by contra- 
diction rather than appeal to 7.22, in effect repeating this little argument. It 
depends on the statement to be proved and how much one is annoyed by deal- 
ing with negative statements. We will illustrate both styles. Incidentally, the 
term “transfinite” is used because U may be longer than N, but the theorem 
also holds, of course, when U is finite or similar with N. 

The next lemma is the key step in the proof of the fundamental theorem 
which follows it. 

7.23. Lemma. Suppose U is a well ordered set and h : (U — *• E) x U —> E 
maps the partied functions from U to E and U into E. It follows that for every 
t G U, there exists exactly one function 

a, : seg(f) -> E 

which satisfies the identity 

o t (x) = h{c r, fseg(x), x) (x < t). (7-12) 

Proof. By Transfinite Induction, assume that for each u < t there exists 
exactly one function o u : seg (u) — > E such that 

o u {x) = h{o u fseg(x),x) (x < u). (7-13) 

The induction hypothesis gives us nothing if t = Of/ is the least point in 17, 
but the required conclusion is trivial in this case taking <7 0 = 0. If f = Sv is a 
successor point in U, we set 

o t =o v U {{v,h{a v ,v))}\ 

now (7-13) holds for x < v by the induction hypothesis and it holds for x = v 
by the definition (since a, \ seg (?) = a v ). For the last case, when t is a limit 
point, we need a 

Sublemma. The set of functions {a u \ u < t} is a chain under C, i.e., 

x < u < v < t => a „(x) = cj v (x) . (7-14) 

Proof. Assume not and let x be least such that (7-14) fails for some u > x, 
u < v < t. This means that 

o u fseg(x) = o v \ seg(x), 
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and then by the identity which a u , cr v satisfy, 

a u {x) = h{a u fseg(x), x) 

= h{a v fseg(x), x) 

= cr v (x), 

which contradicts the choice of x. H (Sublemma) 

We now take 

G, = U { ff u | U < t}\ 

this is a function with domain seg(f) by the Sublemma, and it satisfies (7-12), 
since for each x < t, 

cr,(x) = a u (x ) for some u such that x < u < t, 

= h(cr u fseg(x), x) by ind. hyp., since u < t, 

= h{o t fseg(x), x) since a u fseg(x) = a t fseg(.x) 
by the definition of a, . 

This completes the proof of existence of a t , and its uniqueness is easily 
verified by Transfinite Induction, using (7-12). H 

7.24. Transfinite Recursion Theorem. For each well ordered set U and each 
function h : (U — E) x U — > E, there exists exactly one function f : U — > E 
which satisfies the identity 

f{x)=h{f\ seg(x),x) (x e U). (7-15) 


Proof. Consider the well ordered set Succ( U ) which has some point t = tu 
on top of U, and the extension h' : (Succ((7) — > E) x Succ(C) — > E of h 
defined by 


f h{a \ U, x), if x G U, 

(e*, otherwise, i.e.. if x = t, 


where e* is some arbitrary member of is, of no consequence. The function /?' 
has the correct domain for applying the Lemma to Succ (U) and h' , because 
se gsucc(t/)(0 = U. For the top point t, the Lemma gives a unique function 
/ = a, : U — > E which satisfies (7-12) for all x G U. H 

Perhaps the simplest, non-trivial application of Transfinite Recursion is 
the definition of transfinite orbits for mappings of an inductive poset into 
itself. We consider first the basic case of expansive mappings, defined in 7.19. 
Expansive mappings are related to the monotone mappings we studied in the 
last chapter, but the two notions do not coincide; for example, the constant 
mapping Ti->0on T’(N) is obviously monotone but not expansive, while 


n(X) 


IU{1} if 0 G X, 
X U {2} ifO i X 


is expansive but (easily) not monotone. It turns out, however, that results 
about expansive mappings can often be translated into similar results about 
monotone mappings. 
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Figure 7.3. The transfinite orbit of an expansive mapping. 

7.25. Iteration Lemma. Suppose n : P — > P is an expansive mapping on an 
inductive poset and U is a well ordered set. There exists a unique function 
a : U — > P which satisfies the following conditions : 

cr(0) = _L, 

if x = 5 ( 7 ), then cr(x) = 7i{a(y)), (7-16) 

if Limit(x), then <r(x) = supp {cr( v) | v < x}. 

In addition , this a is monotone from U to P. i.e.. 

x < y => o(x) <p o(y). (7-17) 

Proof. The conditions in (7-16) just about give a definition of a by trans- 
finite recursion, except that there is a problem in the limit case if the set 
{( r(y ) | v < xj is not a chain in P. To account for this possibility, we define 
a by appealing to Theorem 7.24 so that it satisfies the following: 

f _L, if x = 0, 

n(o(y))> if x = S{y) for some y, 

er(x) = supp {o(y) \ y < x}. if Limit(x) 

&(Vxi < X 2 < x)[<t(xi) <p u(x 2 )], 

_L. otherwise. 

where < = <u is the wellordering of U, as in the statement of the theorem. 
Sublemma. For each x £ U, 

Xi < X 2 < x=><r(x 1 ) <P cr(x 2 ). (7-18) 

Proof. Assume not and let x be least in U such that (7-18) fails. Since 
(7-18) holds vacuously when x = 0 is the least element in U , we need consider 
only two cases. 

Case 1. x = S(y) is a successor point. Assume xi < X 2 < x. If x 2 < y, we 
get cr(xi) <p cr(x 2 ) by the choice of x. The only other possibility is that X 2 = 
x, but then cr(xi) <p a{y) by the choice of x, and a{y) < n{a{y)) = o(x) 
by the expansiveness of n. Thus, (7-18) holds for x, which is a contradiction. 
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Case 2. x is limit. By the choice of x, 

x\ < X 2 < x=>ct(xi) < a(x 2 ), (7-19) 

and so in order to get a contradiction, we only need show that 

x\ < x=>a(xi) <p a(x). 

This holds because (7-19) and the definition of a also imply immediately that 
cr(x) = supp {a(y) \ y < x}. H (Sublemma) 

The result now follows directly, since the Sublemma implies in particular 
that the “otherwise” case in the definition of a never comes up. H 

The transfinite orbit a : U — > P of a mapping n : P — > P guaranteed by 
the Iteration Lemma is obviously an extension of the orbit ( n 1 — > x„) which 
we defined in the proof of the Continuous Least Fixed Point Theorem 6.21, 
at least if the well ordered set U is longer than N. It is one of the tools we will 
use in the proof of the Least Fixed Point Theorem, as follows. 

7.26. Plan for a proof. Suppose that for the given inductive poset P, we can 
construct a well ordered set U such that there exists no injection a : U >-* P. 
In particular, the transfinite orbit a : [7 — > P of 7.25 cannot be an injection, 
and so there must exist x < y such that o(x) = o(y ). The monotonicity of a 
implies that 

x < u < y => o(x) = a(u); 

x has a successor since it is not maximum in U, x < Sx < y; and, hence, 
a(x) = a(Sx ) = 7i(a(x)), 

in other words, the point o(x) is a fixed point of n. 

Thus, to prove that every expansive mapping n : P — > P on an inductive 
poset has a fixed point, it is sufficient to show that for each set P, there exists 
some well ordered set U which cannot be injected into P. This is precisely 
Hartogs’ Theorem, for which we aim next. To show it, we must study in some 
detail the question of comparability of well ordered sets as to length. 

The picture of the typical well ordered set in Figure 7.1 suggests that we 
should be able to compare any two of them, line them up side-by-side, the least 
element Of/ of one facing the least element 0 y of the other, the next Su(0u) 
facing Sy{0y), the first limit point toy (if it exists) facing co y, etc. until we 
run out of elements in either U or V. The precise version of this fact is a 
generalization of the Uniqueness Theorem for the natural numbers 5.4. 

7.27. Definition. An initial similarity 

71 : U > — » ti[U] C V 

from one well ordered set into another is a similarity of U with an initial 
segment of V. If such an initial similarity exists, we say that U is less than or 
equal to V in length, in symbols: 

U < 0 V (3/ C V)[U = 0 I]. 


(7-20) 
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Figure 7.4. Portrait of an initial similarity. 


We also write 

U < 0 V ^ df U < 0 V&Uf D V (7-21) 

By 7.9, every initial similarity n : U >— > F is either a similarity with F, or one 
with a proper initial segment of F, so that 

U < 0 V (3x G V)[U = 0 seg v(x)]. (7-22) 

7.28. Exercise. If n : U >— > F p : V >—> W are initial similarities, then so 
is their composition pn : (7 fF. 

7.29. Proposition. For all well ordered sets U,VW, 

U < 0 U, 

if [U < 0 V & V < 0 W], then U < 0 W, 
if [U <0 V&V < 0 U], then U = a V 

Proof. Only the third of these assertions needs proof and it follows from 
7.21. The composition pn of the initial similarities n : U >— > V , p : F >— > U 
witnessing the hypothesis is an initial similarity pn : U >— ► U, which if it 
were not onto, would witness that U is similar with one of its proper initial 
segments; so it is a bijection, and then n must also be a bijection. H 

7.30. Theorem. A function n : TJ — > F is an initial similarity of a well ordered 
set into another if and only if it satisfies the identity 

n{x) = minj,-{.v G F | (Vm <jj x)[n(u) <v y]} (x G U). (7-23) 

Proof. If n ; U >— ► F is an initial similarity, then it is order-preserving and 
one-to-one, so it satisfies 

(Vw < u x)[n{u) < v n{x)], (7-24) 

and hence 

2 = mindly G F | (Vm < v x)[n{u) < v y]} <v n(x). 

Since n is initial and z < v n(x), there exists some u G U such that n{u) = z. 
Assuming towards a contradiction that z = n{u) <v n{x), we infer that 
u <u x because n is an order-preserving injection; and hence n{u) <v z by 
the definition of z, which is absurd since z = n(u). 

Conversely, if n ; U — > F satisfies (7-23), then it is an order-preserving 
injection, since by (7-23), u < v x=>n{u) < v n{x). Suppose the image 
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n[U] is not an initial segment of V and choose x least in U such that there 
exists some y <y n(x), y ^ n[U], Now 7t[seg F (x)] C V by the choice of 
x; it is a proper initial segment since it does not contain y; so re[seg c/ (x)] = 
seg F (z) for some z G V, and (7-23) yields n(x) = z. Thus y <v z and 
y G segr(z) = rc[seg£/(jc)], which is absurd. 3 

7.31. Theorem (Comparability of well ordered sets). For any two well ordered 
sets U, V, either U < 0 V or V < 0 U. 

Proof. The result is trivial if V = 0, so we may assume the minimum 0 y 
exists. By the Transfinite Recursion Theorem 7.24, there exists a function 
7i : U — > V which satisfies the identity 

{ min v {y G V \ (Mu <u x)[n(u) <y y]}, 

if (3y G V)(Vu <u x)[n(u) <y y], (7-25) 

0 y, otherwise. 

In pedantic detail, we are applying here Theorem 7.24 with the mapping 
h : (U — >• E) x U — > E, defined by 

{ min F {y G V \ (Mu < v x)[p(u) < v y]}, 

if (3y G V)(Mu <u x)[p(u) < v y], 

0 v . otherwise. 

We now distinguish two possibilities. 

Case 1. For every x ^ 0 u,n(x) ^ 0 F . This means that the second case 
in (7-25) never applies, n satisfies the identity (7-23) and it must be an initial 
similarity by Theorem 7.30. 

Case 2. For some a G U, a ^ 0 u, we have n(a) = 0 y. Let a ^ 0^ be least 
in U and such that n(a) = 0 y, and consider the restriction 

p = (n fsegc/(a)) : seg v (a) -> V 

Now p satisfies (7-23), so by Theorem 7.30, it is an initial similarity ofsegc/(«) 
into V. In particular, the image /j [seg ;•(«)] = 7r[seg ( /(a)] is an initial segment 
of V; if it were proper, then 7r[seg ( /(u )] = seg F (z) for some z G V and 
(7-25) would yield n(a) = z ^ 0 y, contradicting the choice of a; hence, 
7i[seg[/(fl)] = V. Thus, V = 0 seg u(a), which gives us an initial similarity of 
V into U . H 

This fundamental theorem has a host of corollaries, some of which are 
worth listing immediately. The first one gives us an easier way to compare 
well ordered sets. 

7.32. Corollary. For all well ordered sets U, V, 

U < 0 V (37i : U >— ► V) [71 is order-preserving]. 

Proof. Suppose 7i : U >-> V is order-preserving but U V, so that 
V < 0 U. It follows that V = 0 seg[/(x) for some x by (7-22), and composing 
the order-preserving injections we get an injection p : U >-> seg F (x) which is 
still order-preserving and violates 7.20. 3 
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7.33. Corollary (Wellfoundedness of < 0 ). Every non-empty class £ of well or- 
dered sets has a < 0 -least member, i.e., for some Uo € £ and cdl U e £ , 
U 0 < 0 U. 

Proof. The hypothesis gives us some W e £ , and if W is < 0 -least in £ , 
there is nothing to prove. If not. then 7.31 implies that there exists well ordered 
sets in £ which are similar with proper initial segments of W , so the set 

J =df {x £W\(3Ue £)[U = 0 segp^(x)]} (7-26) 

is non-empty and it has a <w -least element x. By the definition of /. there 
exist some Uo € £ such that Uo = 0 seg^(x) and we claim that this Uo is 
< 0 -least in £ . To prove it. assume towards a contradiction that for some 
U G £ , Uo f 0 U; hence U < 0 Uo = 0 segiy(x); hence U = 0 seg^(y) for 
some y < w x, contradicting the choice of x. H 

Most often this is applied when £ is actually a set. a family of well ordered 
sets, but occasionally it is convenient to cite it more generally for classes. For 
example, there exists a < 0 -least well ordered set which has a limit point — 
namely Succ(N). 

After all this work, still we have not constructed any uncountable well 
ordered sets and it might appear that all our results apply only to peculiar, 
long reshufflings of N. Next comes the second basic theorem of this chapter 
which rectifies the situation. 

7.34. Hartogs’ Theorem. There is a definite operation /(A) which associates 
with each set A, a well ordered set 

X{A) = {h{A ),< xU ,)), 

such that h(A) f c A, i.e., there exists no injection n : h{A) >— » A. Moreover, 
/(A) is < 0 -minimal with this property, i.e., for every well ordered set W , 

if W f c A, then /(A) < 0 W. (7-27) 

Proof. First set 

WO(^4) =df {U | U = (Field( U), <u) is a well ordered set 

with Field) U) C A}, (7-28) 

and let be the restriction of the definite condition =„ to WO(^), 

U ~A v ^df U, V € WO (A) &U=o V 
Clearly is an equivalence relation on WO(d j. and we set 

h(A) = df |[WOU)/~4j C V(WO(A)). (7-29) 

We order the equivalence classes in h(A) by their “representatives”, 

[U/~a\ </(A) [V/~a\ ^=^df U < 0 V\ 
this makes sense because if 

[U/~ A ] = [U'/~aI [V/~ a ] = [V'/~aI and U < a V, 


(7-30) 
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then U' = 0 U < 0 V = 0 V' . The fact that < X ( A ) is a wellordering of h(A) 
follows easily from the general properties of < 0 , 7.31 and 7 . 33 . Taking the 
negation of both sides of (7-30) we infer its strict version, 

V < 0 u <=> [V/~ A ] < /U) [U/~ A ] ( U, V G WOU)). (7-31) 
The basic properties of the Hartogs operation are embodied in the following 
Lemma. For every a = [U/~ A \ G h(A). 

se 8 /(^) ( a ) = {[segc/(x)/~^] | x £ U} = 0 U. 

In particular, every proper initial segment of /(A) is similar with some well 
ordered set U G WO(^4), and every U G WO(v4) is similar with a proper, initial 
segment of /(A). 

Proof. We verify first the identity 

se g/f 4 ) ( a ) = {[segc/(x)/~^] | x G U}. 

If [i = [V/~ A ] < x (a) a > then V < 0 U from (7-31), and hence V = 0 seg v (x) 
for some x G U, so that /? = [segc/(x)/~^]. Conversely, for each x G U, 
seg v (x) < 0 U, hence, [seg K (x)/~ 4 ] < X ( A ) [U/~a] = a, again by (7-31). 

To show the similarity 

U = 0 seg /U) (a) = {[seg£/(x)/~ 4 ] | x G U}, 
define p : U — > h(A) by 

p(x) = [seg[/(x)/~^], {x G U); 
now p is a similarity of U with the image p[U], because 
x <u y segf/(x) ^seg[?(j) 

<*=>■ seg[/(x) < a seg v {y) 

A=> [segc/(x)/~ 4 ] < x(A) [seg(/(j)/~^]. H (Lemma) 
Suppose now, towards a contradiction, that there exists an injection 

n : h{A) >-+ A. 

and let B = n[h{A)\ C A be its image. The injection n copies the wellordering 
of h(A) to a wellordering of B, 

x <b }’ 7i~ l (x) < X ( A) n~\y) (x, y G B), 

so that U = ( B . < B ) is a well ordered subset of A, and by its definition, 

u =„ xU). (7-32) 

But U is similar with a proper initial segment of /{A) by the Lemma, and 
hence U < 0 /(A), which contradicts U = 0 /{A). 

To show the minimality of /(A), notice that for any well ordered set W , if 
W < 0 x(A), then W = 0 seg (^(a) for some a. = [U/~ A \, so that W = 0 U 
by the Lemma. Thus, 


if W < 0 x(A), then W < c A, 


(7-33) 
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since (the field of) U is a subset of A and similarities are injections. Taking 
the negation of both sides, 

if W f c A, then ~^[W < 0 /(A)], hence /(A) < 0 W. H 

Of course, we would like to prove that A < c /(A) instead of the timid 
x(A) f c A, and this is certainly true, but its proof depends on the Axiom of 
Choice. Wait for a bit until we finally bring the Deus ex Machina onto the 
stage. 

The annoying details of this proof are forced on us by the fact that the 
restriction < A of the definite condition < 0 to WO(A) is not a wellordering, for 
the trivial reason that it is not antisymmetric: there may exist distinct, similar 
U, V G WO(ri), in fact they always do, if A has more than one element. This 
is why we were forced to take h(A) as a set of equivalence classes rather than 
simply set h(A) = WO (A). Technically, < A is a prewellordering (ugh!) and 
it is worth recasting the argument in a different form, after introducing this 
notion. See Problems x7.17 - x7.20. 

The Hartogs operation can be used to construct general infimum and supre- 
tnum operation for families of well ordered sets (Problems x7.26 and x7.27), 
and it has many other interesting properties. We use it next to extend the 
Continuous Least Fixed Point Theorem 6.21 to discontinuous mappings. Let 
us first put down, for the record, the Fixed Point Theorem for expansive 
mappings, which we have already discussed. 

7.35. Fixed Point Theorem (Zermelo). 18 Every expansive mapping n : P — > P 
on an inductive poset has at least one fixed point, i.e., some x* € P satisfies the 
equation 

x* = n(x*). 

Proof. The argument given in 7.26 needs only some well ordered set U 
which cannot be injected into P, and U = y(P) does it. H 

7.36. Least Fixed Point Theorem. Every monotone mapping n : P — > P on an 
inductive poset has a ( unique ) least fixed point, i.e., for some x* G P, 

n(x*) = x*, 

(Vy G P)[n{y) = y => x* < y]. 

In fact, x* satisfies the following, strong minimality property: 

(Vy G P)[n(y) < y => x* < y], (7-34) 

Proof. A careful examination of the proof of the Iteration Lemma 7.25 
and the proof of the Fixed Point Theorem 7.26 reveals that exactly the same 
construction of the fixed point for an expansive mapping works and yields 


18 Zermelo did not formulate the Fixed Point Theorem in this generality, which is why it and 
many of its Corollaries have been attributed at various times to later mathematicians. But the 
famous “first proof” of the Weltordering Theorem which Zermelo gave in 1904 proves exactly this 
result, trivially restricted to the special case which interested him. 
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the least fixed point of a monotone mapping. However, it is not necessary 
to do this, as the Least Fixed Point Theorem is an easy consequence of the 
Fixed Point Theorem. The basic idea is to observe that the given monotone 
mapping n is necessarily expansive on some inductive sub-poset of P. 

Let 

Q = {x € P | x < 7 t(x) & (Vy)[7r(j) < y => x < y ]} 
and observe first that the restriction 

<q= {(x,y) | x,y € Q&x < P y} 

of <p to Q is also a partial ordering — this is automatically true for the re- 
striction of < P to any subset of P. (We skip the subscripts P and q for the 
remaining of the argument.) In addition, n[Q\ C Q. because 

x < n(x) =>7r(x) < n{n(x)), 

and for every y, 

n(y) < y &x < y => n(x) < n{y) < y, 

by the monotonicity of n. It follows that the restriction 

tiq = {(x,7i(x)) | x G Q } 

of n to Q is a mapping on Q. it continues to be monotone (of course) and 
it is also expansive, because of the definition of Q. To apply the Fixed Point 
Theorem 7.35 to Q and tiq we need the following. 

Lemma. The poset Q is inductive. 

Proof. It is enough to show that for every chain S C Q. the least upper 
bound M = sup .S' (which exists in P because P is inductive) is a member of 
Q. i.e., (1) M < n(M), and (2) for every y, n(y) < y => M < y. For (1) we 
compute: 

x G S => x < M. because M is an upper bound of S. 

=> 7r(x) < 7 z{M), because n is monotone, 

=» x < 7t(x) < n(M), because x G S C Q. 

and therefore n(M) is an upper bound of .S' and we have M = sup S < n(M). 
(2) follows from the observation that every y such that n(y) < y is an upper- 
bound of Q (from Q' s definition), and therefore an upper bound of the smaller 
set S C Q. so that M = sup S < y. H (Lemma) 

By the Fixed Point Theorem 7.35 now, there exists some x* G Q. such that 
7i(x*) = x* and (7-34) holds simply because x* G Q. 

That there is at most one least fixed point is obvious: if y* is also a least 
fixed point, then x* < y* and y* < x* , so x* = y*. H 

The full Least Fixed Point Theorem frees us from the necessity to check 
continuity in the applications of least fixed points to computer science, the 
fixpoint theory of programs. This is nice, but not very important, since (as 
we observed in Chapter 6) the mappings which come up in algorithmic appli- 
cations are typically continuous “by inspection”. However, the theorem has 
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□ o o ■ • • 

( 0 . 0 ) ( 0 , 1 ) ( 0 . 2 ) ••• ( 1 . 0 ) ( 1 , 1 ) ( 1 , 2 ) ••• 

Figure 7.5. The sum N + 0 N. 

more significant, deeper applications to the general theory of sets, particularly 
in the study of definability in set theory as well as the construction of examples 
and counterexamples with specified properties. We will encounter several of 
these in the chapters which follow. 


Problems for Chapter 7 

x7.1. Every linear ordering of a finite set is a wellordering. (See the related 
Problem x6.7.) 

7.37. The sum P + 0 Q of two posets P and Q is obtained by placing disjoint 
copies of P and Q side-by-side, every point of P preceding every point of Q. 
Formally, we set P + 0 Q = R, where 

Field(i?) = d f ({0} x Field(P)) U ({1} x Field(g)), (7-35) 

and for (z, x), (/, y) <E Field(i?), 

(i- x) < R (j, y) ^=> d f i < j V [/' = j = 0& x < v y] , , 

V [z = j = l&x < v y\. y 1 

The idea is that P is similar with the set {0} x Field(P) partially ordered 
by its second elements, by the obvious similarity (x i— > (0, x)), and (again) 
Q= 0 {1} x Field(G). 

x7.2. If P = 0 P' and Q = 0 Q' , then P + 0 Q = 0 P' + a Q' . 
x7.3. For all posets P, Succ(P) = 0 P + 0 [0, 1). 
x7.4. For all posets P.Q.R, 

P +o (2 +o P) = o (P +o Q ) +0 P- 

x7.5. If U and V are well ordered sets, then so is their sum U + 0 V. 

x7.6. Prove that [0. 1)+ 0 N = a N N+ o [0, 1), so that the addition operation 
on well ordered sets is not commutative. 

7.38. The product P 0 Q of two posets is obtained by replacing each point of 
Q by a copy of P. Formally, we let P 0 Q = R, where 

Field (R) = Field (P) x Field(2), (7-37) 

and <r is the inverse lexicographic ordering of pairs i.e. , we compare the 
second members first: for {x\,yi), (x 2 ,y 2 ) € Field(f?), 

(*i,Ti) <R (x 2 ,y 2 ) ^=^df y\ <Q y 2 v [y\ = y 2 &x 1 <p x 2 \. (7-38) 
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A : • • • • • 

A x B : •••■■ •• 



Figure 7.6. The product of two well ordered sets. 

There is no special reason for ordering pairs by looking at their second mem- 
bers first, it is just that Cantor chose to do it this way and it has stuck. 

x7.7. If P = 0 P' and Q = 0 Q’ , then P ■„ Q = 0 P’ - 0 Q’ . 

x7.8. Prove that P - 0 [0, 2) = 0 P + 0 P, but [0, 2) •„ N = 0 N N - 0 [0. 2), so 
that multiplication of well ordered sets is not commutative. 

x7.9. For all posets P.Q.R, 

P o ( Q -o R) =o (P -o Q ) -o R- 

x7.10. The product of two well ordered sets is well ordered. 

x7.11. For each well ordered set U, there exists exactly one function 

Parity : U -» N, 

such that Parity(j) = 0 if y = 0 or y is a limit point, and at successor points, 
Parity(S(x)) = 1 - Parity(x). 

x7.12. Every point y in a well ordered set U can be expressed uniquely in the 
form 

y = S’\x ), (7-39) 

where (1) x is either the minimum 0 or a limit point. (2) n is a natural number, 
and (3) the function (z, x) i-> S l (x) is defined by the recursion 

S°(x) = x, S l+1 (x) = S(S‘ (x)). 

x7.13. For any two well ordered sets U, V . there exists at most one initial 
similarity n : U >-» n[U] C V . 

x7.14. For all well ordered sets U, V, W, 

U<o V & V < 0 w => u < 0 w, 

U <o V&V<o W => U <0 w. 

x7.15. If k and a are well orderable cardinal numbers, then either k < c a or 

X < c K. 

*x7.16. If' k is a well orderable, infinite cardinal number, then k + 1 = c k. 
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( 0 , 0 ) ( 1 , 0 ) ( 2 , 0 ) ••• ( 0 , 1 ) ( 1 , 1 ) ( 2 , 1 ) ••• 

N -o [0,2): □ o o ••• ■ • • 

[0, 2) ■„ N : □ o o o o 

( 0 , 0 ) ( 1 , 0 ) ( 0 , 1 ) ( 1 , 1 ) ( 0 , 2 ) 

Figure 7.7. Multiplication of well ordered sets is not commutative. 

7.39. Definition. A prewellordering on a set A is any relation ^ C A x A which 
is reflexive, transitive, connected (total) and grounded. “Connected” means 
that any two points in A are comparable, 

(Vx, y G A)[x ^ )’ V y ^ x], 

and “grounded” means that every non-empty X C A has a ^ -least member, 
(VIC A, X^ 0)(3x G X){\/y G X)[x £ y]. 

A prewellordering would be a wellordering, if only it were antisymmetric. 

x7.17. For each set A, consider the set 

B = {X C A | X is finite} 
of all finite subsets of A. and set on B 

X^bY X < c Y. 

Prove that <b is a prewellordering. 

x7.18. A relation A C A x A is a prewellordering if and only if there exists 
a well ordered set U = (Field(C), <u) and a surjection n : A — » Field(t7) 
such that 

x ^ y n{x) <u n{y) (x,yGd). 

x7.19. For each set A, the relation 

U < A V CFG WOU) & U < 0 V 

is a prewellordering of WO(^4). 

x7.20. Rework the proof of Hartogs’ Theorem by applying the preceding two 
problems. 

x7.21. For every set A. there exists a well ordered set V such that there exists 
no surjection n : A — » V. 

x7.22. Prove that if A < c B, then %{A) < 0 x(B)- 

x7.23. Prove that /([0, n)) = 0 [0. n + 1). 

x7.24. If IF is a well ordered set and W < c A, then IF < a /(A). 

x7.25. For each set A and each well ordered set U, 

U<ox(A) <*=>• Field(C) < c A. 
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Figure 7.8. Portrait of a prewellordering. 



The operation / (A) is definite, as we gave an explicit definition of the field 
h(A) and the wellordering < x (a) of /(A) from A. We can use it to define some 
related, “infinitary” operations on families of well ordered sets. 

*x7.26. Define a definite operation inf(g’), such that for every non-empty 
family S of well ordered sets, inf(g’) has the following properties: 

(1) inf(lf) is a well ordered set. 

(2) For some U £ S, inf(g’) = 0 U. 

(3) For every U £ S, int(S) < 0 U. 

Hint. Look for inf(g’) in the initial segments of S). 

*x7.27. Define a definite operation sup(g’), such that for every non-empty 
family S of well ordered sets, sup(f’) has the following properties: 

(1) sup (S) is a well ordered set. 

(2) If U £ S, then U < 0 sup(g’). 

(3) If IF is a well ordered set and for each U £ S, U < 0 IF, 
then sup(g’) < D IF. 

*x7.28. Let < be a linear ordering of a set A and define on the poset V{A) the 
mapping 

n{X) =df {y e A I (Vx < y)[x G X ]}. 

Verify that n : V{A) — > V{A) is monotone and give an example where it is not 
countably continuous. Prove that if A w is the least fixed point of n, then 

x € Ayj <=> {(i, t) e A x A | s < t < xj is a wellordering. 

There are situations where it is easier to use the proof of the Fixed Point 
Theorem 7.35 rather than its statement. 

*x7.29. Detailed Fixed Point Theorem. For each expansive or monotone map- 
ping Ti : P — > P on an inductive poset P. there exists a subset D C P with the 
following properties: 

1. D is a well ordered chain in P. 

2. Every member of D is determined from its predecessors by the formula 

x = (sup {j ; € D | y < x}). 

3. No point in D is a fixed point of n. 
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4. The point 7r(supZ>) is a fixed point of n, 

7z(7t(supZ>)) = 7r(supD). 

5. If 7 z is monotone, then 7i(supZ)) is the least fixed point of n. 

Prove also that these conditions determine D uniquely. 

*x7.30. Suppose P and Q are inductive posets and n : P x Q —> P is a 
monotone mapping on the product and define the mapping p : Q — > P by 
appealing to Problem x6.4 and the Least Fixed Point Theorem 7.36, 

P(y) = (MX G P)[n(x, y) = x] (7 _ 4Q , 

= the least fixed point of n(x, y) = x. 1 

Prove that p is a monotone mapping, and if n is countably continuous, then 
so is p. 


*x7.31. Bekic-Scott Rule. Suppose P\ . Pi are inductive posets, and 


: Pi x P 2 -»• Pi, n 2 : Pi x P 2 -»• P 2 

are monotone mappings. Using the /^-notation for least fixed points of (7-40), 
let 

p(x 2 ) = (fixi G P\)\n\(x\,x 2 ) = x\\. 


let 

T2 = (flX2 G Pi)[n 2 (p(x 2 ),x 2 ) = x 2 ] 

be the least fixed point of the mapping x 2 >— > n 2 {p{x 2 ),x 2 ) (which is monotone 
by x7.30) and finally let 

{x*,x 2 ) = {p{x\,xi) G Pi x P 2 )[{n\(x\,x 2 ),n 2 (xi,x 2 )) = (xi,x 2 )] 

be the least fixed point in the product poset. Prove that 


x 2 = x 2 . 

The problem insures that we can compute simultaneous least fixed points by 
iterating the least fixed point operation (px G P) on one inductive poset at a 
time. 
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8.1. The Axiom of Choice, AC. For any two sets A, B and any binary relation 

P C (Ax B ), 

(V* G A)(3y G B)P(x,y)=>(3f : A -> B)(\/x G A)P(x,f(x)). (8-1) 

This is the last and most controversial axiom of Zermelo. To understand 
how such an axiom might be needed, consider the classical example of Russell, 
where A is a set of pairs of shoes, B = (J A and 

P{x,y) •<=>■ y G x. 

The function 

/ (x) =df the left shoe in x, (x G A) 

obviously selects a shoe from each pair, in symbols (Vjc G A)P(x,f(x)). If, 
however, A is a set of pairs of socks, then there is no obvious way to define 
a function / : A — > (J A which selects one sock / (jc) G x from each pair, 
because (as we stipulate for the example), a pair of socks comprises precisely 
two perfectly identical objects. We can still prove that a selector function / 
exists when A is finite, by induction on the number of elements in A (Problem 
x8.1). But in mathematics we can imagine infinite sets of pairs of socks, and 
in that case we need something like the Axiom of Choice to guarantee the 
existence of such a function. 

Less amusing but more significant for mathematics is the proof of the basic 
theorem 2.10, where we considered a sequence A 0 , Ai, ... of countable sets 
and began with the phrase 

It is enough to prove the theorem in the special case where none of 
the A„ is empty, in which case we can find for each A„ an enumeration 
n„ : N — » A„. 

Perhaps for each n “we can find” (i.e., “there exists”) some enumeration n of 
A n , but the rest of the proof needs a function (n n„) which associates a 
specific enumeration n„ with each n: which of the axioms (I) - (VI) can be used 
to prove the existence of such a function? Here A = N, B = (N — > (J“ 0 ^«) 
and 

P{n,n) n : N — » A n , 

so that (Vn G N)(37i G B)P{n,n) from the hypothesis that each A n is non- 
empty and countable: and the Axiom of Choice guarantees precisely that there 
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Figure 8.1. A selector for P C A x B. 

exists a function / : N — » B such that for each n G N, the value /(«) = n„ 
satisfies P(n, n„), i.e.. it enumerates A„. Such “silent” appeals to the Axiom 
of Choice, masked by the notation, are very common in mathematics and 
especially in analysis, where the classical theory of limits and continuous 
functions cannot be developed in a satisfactory way without choices. 

If we picture P C A x B as a subset of the product space, then the hypothesis 
(Vx G A) (3 y G B)P(x, y ) means that the fiber or section 

Px =df {y G B I p{x,y)} (8-2) 

above each x G A is non-empty; the Axiom of Choice guarantees the existence 
of a selector for P, a function / : A — > B which assigns to each x G A exactly 
one point in the fiber above it. There are two other, simple reformulations 
of the axiom which express in different ways the process of “collecting into a 
whole” any number of unrestricted, non-conflicting choices. 

8.2. Definition. A set S is a choice set for a family of sets % if (1 ) .S' C (Jg\ 
and (2) for every X G §*, the intersection S fl X is a singleton. A choice set S 
selects from each X G I? the unique member of the intersection S fl X. 

8.3. Exercise. If lb G §?, then % does not admit a choice set. Also , if a fib. then 
the family W = {{«}. {a, b}, {b}} does not admit a choice set. 

8.4. Theorem. The Axiom of Choice is equivalent to the following proposition : 
every family IS of non-empty and pairwise disjoint sets admits a choice set. 

This is the version of AC postulated by Zermelo. 

Proof. Assume first the Axiom of Choice and let U = (J W be the union 
of the given family of pairwise disjoint, non-empty sets, which means that 

(VX G %){3x G U)[x G X]. 

The Axiom of Choice guarantees that there exists a function / : W — > U , 
such that 

(VX G r)[/(X) G X]; 

we set S = /[*?] = {/(X) | X G §*}, and the fact that the members of W 
are pairwise disjoint implies easily that S intersects every member of 'S in a 
singleton. 
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For the converse, assume 

(Vx G A) (3 y G B)P(x,y), 

set for each x G A 

U x = {{t, y) G P\t = x}, 

and let 

% = {U X | x G A}. 

Each member of §T is non-empty by the hypothesis and it is determined by the 
constant, first member of the pairs in it, so any two members of W are disjoint. 
If S is a choice set for this if, then the function 

/ (x) = the unique y such that (x, y ) G S 
easily satisfies the conclusion of the Axiom of Choice. H 

8.5. Definition. A choice function for a set A is any partial function 

e : V(A) — >• A, 

such that 

&e(x) ex. 

8.6. Lemma. The Axiom of Choice is equivalent to the assertion that every set 
admits a choice function. 

Proof. For every A. obviously (VA G V{A) \ {0})(3_p G A)[y G A]; 
and so. directly from the Axiom of Choice, there must exist some function 
e : V{A) \ {0} — > A such that 

(VA G P(A) \ {0})[e(A) G A], 

The converse is easy enough to leave for an exercise. 3 

8.7. Exercise. If every set admits a choice function, then the Axiom of Choice 
is true. 

8.8. But is it true? (1) Naively understood, the Axiom of Choice asserts that 
if each of a set of non-conflicting choices is possible, then they can all be 
made independently and their results collected into a completed whole, a set. 
By this understanding it is quite obvious, it can be justified by the natural 
interpretation we would give to the Powerset Axiom: when we grant sethood 
to the class {A | A C A} of all subsets of A, we truly mean all subsets of 
A, including those for which the membership criterion is not determined by 
some explicit law but by free choice, by chance if you will. 

The Axiom of Choice is different in form from the earlier “constructive” 
axioms (II) - (VI), because it postulates directly the existence of a set for 
which it does not supply a definition. Each of (II) - (VI) grants sethood to 
a specific, explicitly defined collection of objects, it legitimizes a special case 
of the most appealing (if false) General Comprehension Principle 3.3. The 
Axiom of Choice is the only Zermelo axiom other than Extensionality which is 
not a special case of the General Comprehension Principle. This is misstated on 
occasion, to make the claim that the Axiom of Choice is the only one which 
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demands the existence of objects for which it does not supply a definition, 
which is not true: the Extensionality and Powerset Axioms do the same, in a 
more fundamental if indirect manner. 

Zermelo introduced the Axiom of Choice explicitly in 1904, in a brief paper 
in which he used it to prove that every set is well orderable. This was a 
long-standing conjecture, and Cantor had outlined a proof of it in a letter to 
Dedekind, then still unpublished. His proof, however (and the related proof 
of the Cardinal Comparability Hypothesis), depended on intuitions about 
sets which were not sufficiently explained. In contrast to this, Zermelo made 
it clear, from the start, that his own detailed proof depended on the Axiom 
of Choice, and he was immediately attacked for this by some of the leading 
mathematicians of the time, for introducing a questionable method to derive 
an implausible conclusion. Given the fact that choice principles were by no 
means new to mathematics and that they permeate Cantor’s earlier reasoning, 
it is fair to say that the shock was caused more by the realization of the power 
of the axiom than by its meaning. 

In the next, fundamental result, we establish the somewhat surprising equiv- 
alence of the Axiom of Choice with two very different set theoretic claims. We 
have kept the traditional names for these propositions — Axiom, Hypothesis, 
Theorem — which have been attached to them by the historical accident of 
when and how they were introduced in the mathematical literature. 

8.9. Theorem. The following propositions are all equivalent. 

(1) Axiom of Choice: Every set admits a choice function. 

(2) Hypothesis of Cardinal Comparability: For any two sets A, B. either 
A < c B or B < c A. 

(3) Wellordering Theorem: Every set is well orderable. 

Proof. We verify, round-robin style, that each of the first two propositions 
implies the next, and finally (3) => (1). 

(1) => (2). By Exercise 6.17, the poset (A >->• B) of partial injections from 
A to B is inductive. The idea is to define (using AC) an expansive mapping 
on (A B) which extends properly each partial injection p : A>-^ B whose 
domain does not exhaust A and whose range does not exhaust B ; any fixed 
point of that mapping then will either be defined on all of A, witnessing that 
A < c B , or its image will exhaust B , witnessing that B < c A. 

In detail, let 

e A ■ V(A) - A. e B : V(B) - B 

be choice functions on A and B. supplied by the Axiom of Choice, and define 
7i on (A B) as follows: 

[p U {(e A {A \ Domain {p)),e B {B \ Image)/))))}, 
n(p) = < ifDomain)/?) C A & Image)/)) C B. 

y p. otherwise. 
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This mapping is obviously expansive, and so by the Fixed Point Theorem 7.35, 
it has a fixed point p* : A — *■ B, for which 

either Domain(p*) = A or Image(p*) = B. 

Case 1, Domain(p*) = A. Now p* : A >— » B is an injection of A into B, 
and so A < c B. 

Case 2, Image(p*) = B. In this case, p* is a bijection of Domain(p*) with 
B, and so B = c Domain(p*) C A and hence B < c A. 

(2) ==> (3). By the Cardinal Comparability Flypothesis, either A < c h(A) 
or It (A) < c A, where/; (A) is theHartogs set for A: but h(A) -£ c HbyHartogs’ 
Theorem 7.34, and so A < c h{A)\ in addition, h{A) is well orderable, and 
hence A is also wellorderable by Exercise 7.3. 

(3) => (1). If < is a wellordering of A, then the partial function 

e{X) = the < -least member of X 

is a choice function for A. H 

8.10. But is it true? (2) The import of this theorem is that if we accept the 
basic, constructive first six axioms of Zermelo, then the Axiom of Choice, 
the Hypothesis of Cardinal Comparability and the Wellordering Theorem 
express in three different ways the same set theoretic principle. No doubt, 
the Axiom of Choice is the most direct and intuitive formulation of this 
principle, the one which makes it most obvious that it is true. The Cardinal 
Comparability Hypothesis is certainly easy to understand and plausible, but 
few would propose it as an axiom: it has the feel of a mathematical claim 
which ought to be proved. Finally, the Wellordering Theorem is crystal clear 
in its meaning and it gives a mechanism for making choices which “explains” 
in some way the Axiom of Choice; but far from being obvious, it raises a 
flag of caution. For example, what does a wellordering of the powerset of 
the natural numbers V{N) look like? Without some thought, it is not even 
obvious that T’(N) admits linear orderings (see Problem x8.9). It is quite 
difficult to imagine the structure of the beast, and this naturally casts doubt 
on the truth of the axiom which implies its existence. It is hardly surprising 
that the commotion about the Axiom of Choice was caused by Zermelo’s 
proof of the implication (1) ==> (3), whose conclusion is still thought by 
many to be counterintuitive. 

In addition to the Cardinal Comparability Hypothesis and the Wellordering 
Theorem which are fundamental for the development of set theory, the Axiom 
of Choice is equivalent with a host of other propositions which are important 
in other areas of mathematics. We include here just two of them, which are 
closest to our subject and easy to prove by the methods we have been studying, 
but there are many others. 19 


19 Kenneth Hoffman once declared in a class that the Tychonoff Theorem of general topology 
is “obviously” equivalent to the Axiom of Choice, “since all fishy, general principles are equivalent 
to AC”. 
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8.11. Theorem. The Axiom of Choice is equivalent with the following two propo- 
sitions: 

(1) Maximal Chain Principle: Every poset P has a chain S C P which is 
maximal in the sense that for every other chain S', S C S' =>■ S = S'. 

(2) Zorn’s Lemma: If every chain in a poset P has an upper bound, then P 
has at least one maximal element. 

Proof. AC => (1). Assume AC, let (Chains(P), c) be the poset of all 
chains in P, which is inductive by Proposition 6.14, and assume towards a 
contradiction that there is no maximal chain, so that by AC, 

(VS G Chains(P))(3S' G Chains(P))[S C S']; 

now AC gives us a mapping n : Chains(P) — > Chains(P) which is expansive 
but has no fixed point, contradicting the Fixed Point Theorem 7.35. 

(1) =>• (2). If S is a maximal chain in P and M is an upper bound of 
S. both guaranteed by the hypotheses, then M is maximal in P — because if 
M < P M', then S U |M'} would be a chain which extends S strictly. 

(2) ==> AC. For any two sets A. B, consider the poset (A >->• 5) of partial 

injections. This is inductive, by Exercise 6.17. and so every chain has an upper 
bound in it; by Zorn’s Lemma then, (A >-^ B) has a maximal element /, which 
(as in the proof of (1) => (2) in Theorem 8.9) establishes that either A < c B 
or B < c A, which in turn implies AC. 3 

We now consider two easy corollaries of the Axiom of Choice which express 
simpler principles of choice. 

8.12. Countable Principle of Choice, ACj i. For each set B and each binary 
relation P C N x B between natural numbers and members of B, 

(V/t G N)(3v G B)P(n,y)=>(3f : N — > B)(\/n G N)P(n, /(«)). 

8.13. (VII) Axiom of Dependent Choices, DC. For each set A and each relation 
P C A x A, 

a G A& (Vx G A)(3y G A)P(x, y) 

=> (3/ : N — > A)[f{ 0) = a & (V/t G N)P(f(n),f(n + 1))]. 

In contrast to the full Axiom of Choice which demands the existence of 
choice functions / ; A — > B for arbitrary A, B, the Countable Principle of 
Choice ACn justifies only a sequence of independent choices from an arbitrary 
set B which successively satisfy the conditions 

m/( 0)), P(l,/(1)), -P(2, / (2)), • • • 

The Axiom of Dependent Choices DC also justifies only a sequence of choices, 
where, however, each of them may depend on the previous one. since they must 
now satisfy the conditions 

f>(/(0),/(l)), P(/(l),/(2)). P(f (2) , / (3)) , • • • 

It is easily equivalent to the following, seemingly stronger principle which 
allows each choice to depend on all the preceding ones. 
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8.14. Proposition. The Axiom of Dependent Choices is equivalent to the follow- 
ing proposition : for every set A and every relation P C A* x A between strings 
from A and members of A, 

(Vw G A*)(3x G A)P(u,x)=>(3f : N — > A)(Vn)P(f(n ), /(«)). 

Proof. The implication from this version of DC to the “official” one is easy 
and we leave it for an exercise. Assuming now DC and the hypothesis of the 
seemingly stronger version, define on A* the relation 

Q(u.v) <t=>df (3 a G A)[v = u* (x) &P(u, a)]; 

we obviously have (Vw G A*)(3v G A*)Q{u,v), DC gives us a function 
g : N — > A* such that g(0) = 0 and (Vn)Q(g(n),g(n + 1)), and the function 
we need is / = (Jg, for which f(n) = g(n + 1 )(«) and f(n) = g(n). 3 

8.15. Exercise. Show the direction of Proposition 8.14 omitted in the given 
proof 

8.16. Theorem. (1) The Axiom of Choice implies the Axiom of Dependent 
Choices. 

(2) The Axiom of Dependent Choices implies the Countable Principle of 
Choice. 

Proof. (1) Let e : P(A) \ {0} —> A be a choice function for A and assume 
the hypothesis of DC. The function / : N — > A that we need for the conclusion 
is defined by the recursion 

/( 0) = a. 

f(n + 1) =e({y G A | P(f(n),y)}). 

(2) Assume the hypothesis of the Countable Principle of Choice, let A = 
N x B. let a = (0, b) where b G B is any point satisfying P{ 0. b ), and define 
on A the relation 

Q((n, a), {m, y)) ^=4>df m = n + 1 &P(m, y). 

The function / : N — > N x B supplied by DC for this a and Q takes pairs as 
values, so /(«) = (. g(n),h(n )), g(0) = 0, h{ 0) = b for suitable functions g. 
lu and for every n. g(n + 1) = g(n) + 1, P{g{n + 1 ),h(n + 1)). It follows 
that for every n, g{n) = n and P{n. h(n)), as required by the conclusion of 
the Countable Principle of Choice. H 

We need a definition to formulate the most useful version of the Axiom of 
Dependent Choices. 

8.17. Definition. A graph (G, — >g) is grounded or well founded if every non- 
empty subset of G has a minimal member, i.e., 

if 0 f X C G, then (3m G X)(Vx G X)[m -f G a], (8-3) 

A poset ( P . <) is grounded if the associated “inverse strict graph” (P. >) is 
grounded, which means that for every X, 

if 0 f X C P. then (3m G X)(Va G X)[x < m=> x = m]. (8-4) 



116 


Notes on set theory 


8.18. Exercise. Prove that a linear ordering ( P, <) is grounded if and only if it 
is a wellordering. 

8.19. Proposition. The Axiom of Dependent Choices is equivalent to the fol- 
lowing proposition'. a graph G is grounded if and only if it has no infinite, 
descending chains, i.e.. there exists no function f : N — > G such that for all n, 
/(«) > G /(« + 1), 

/( 0) — /(l) — /(2) -> G ... • 

Proof. First assume DC. If / : N — > G is an infinite, descending chain, 
then the set {/ ( n ) | n G N} has no minimal element, so G is not grounded. 
Conversely, if G has a non-empty subset X with no minimal element, then 
(Vx G X){3y G X)[x — y], and then DC gives us an infinite descending 
chain, starting with some arbitrary a G X . 

Assume now that every graph which has no infinite descending chains is 
grounded and the hypotheses a £ A and 

(Vx G A)(3y G A)P(x,y) 
of DC holds. Consider the graph (A, —> A ) where 

x -> A y <==> df P(x, y) (x, y G A). 

The conclusion of DC is exactly the statement that (A,—> a ) has an infinite 
descending chain which starts with a, so if it fails, there must exist some min- 
imal m G A: this means precisely that (Vy G A)-<P(m,y), which contradicts 
the hypothesis of DC. H 

Grounded graphs have many of the properties of well ordered sets: in 
particular, we can prove propositions by induction and define functions by 
recursion over them, cf. Problems x8.10, x8.11 and Theorem 11.5. The easy 
direction of this result makes DC particularly useful in studying them, as it is 
often simpler to verify that a given graph G has no infinite descending chains 
than to prove directly that it is grounded. 

8.20. But is it true? (3) We have remarked that before it was formulated pre- 
cisely by Zermelo, the Axiom of Choice had been used many times “silently” 
in classical mathematics, and in particular in analysis. These classical appli- 
cations, however, can all be justified on the basis of the Axiom of Dependent 
Choices — in fact most of them need only the weaker Countable Principle 
of Choice. This will become clear in Chapter 10 and Appendix A. Zer- 
melo assumed the full Axiom of Choice because it is a natural hypothesis 
in the context of Cantor’s set theory; because it is needed in the proofs of 
the Wellordering Theorem and the Cardinal Comparability Flypothesis; and 
because it is indispensable for the development of cardinal arithmetic. This 
difference between the choice principles needed for classical mathematics and 
those required by Cantor’s new theory of sets explains in part the strident 
reaction to the axioms of Zermelo by the distinguished analysts of his time 
(including the great Borel), who had used choice principles routinely in their 
work — and continued using them, as they denounced general set theory and 
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called it an illusion: in the context of 19th century classical analysis, the 
Axiom of Dependent Choices is natural and necessary, while the full Axiom 
of Choice is unnecessary and even has some counterintuitive consequences, 
including certainly the Wellordering Theorem. 

We should also mention here that even in general set theory where the 
full Axiom of Choice is routinely accepted as obvious, many of the basic 
theorems do not need it, and in particular all the results of Chapter 2 can be 
based axiomatical!}’ on the Axiom of Dependent Choices. Notice also that we 
proved all the basic facts about well ordered sets in the preceding chapter with 
no appeal to choice principles whatsoever. For this reason, we will deviate 
technically from Zermelo and we will put in our basic system the Axiom 
of Dependent Choices instead of the full Axiom of Choice. “Technically”, 
because we adopt the view that there is no doubt about the truth of the Axiom 
of Choice and we will never hesitate to appeal to it when it is needed: we will 
simply include it (discreetly) among the hypotheses. 

8.21. The axiomatic theory ZDC. The axiomatic system ZDC comprises the 
constructive axioms (I) - (VI) of Chapter 3 and the Axiom (VII) of Dependent 
Choices 8.13, symbolically 

ZDC = (I) - (VI) + DC = (I) - (VII). 

From now on and until Chapter 11, we will use in proofs the axioms of ZDC 
without explicit mention. When the Axiom of Choice is required, we will 
make a note of the fact by annotating the relevant proposition with the mark 
(AC). In Chapter 11 we will complete our axiomatization by adding to ZDC 
the Axiom of Replacement . 

8.22. Consistency and independence results. Could we settle the controversy 
about the Axiom of Choice by simply proving or refuting AC from the con- 
structive axioms (I) - (VI)? Neither possibility seems likely. On the one hand, 
AC is probably true, as are axioms (I) - (VI), and we cannot refute a true 
statement on the basis of true assumptions. On the other hand, AC appears 
to be a genuinely new set theoretic principle, and we cannot expect to prove 
it from the other ones, by logic alone. As a matter of fact, it can be shown 
rigorously that the Axiom of Choice can neither be proved nor refuted from 
axioms (I) - (VI). 

The most direct way to show that a certain proposition <j> cannot be proved 
in a certain axiomatic system T is to produce a model of T in which <f> is false. 
Consider the classical problem about plane Euclidean geometry, whether the 
Parallel Axiom 20 can be deduced or not from the others. To show that it 
cannot, we declare that by “plane” we will mean the two-dimensional sphere, 
the surface of the unit ball, and by “line” we will mean any great circle on the 


20 The Parallel (fifth) Axiom of Euclid is equivalent to the assertion that given a line L and a 
point P not on it. there exists exactly one line L' through P and having no points in common 
with L. 
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sphere. The remaining primitive notions of plane Euclidean geometry can be 
defined naturally in this interpretation, and it is not hard to verify that the 
basic, simple axioms of Euclid are true with these definitions; thus, we have 
a model of plane geometry in which the Parallel Axiom fails, simply because 
any two great circles intersect. We conclude that the Parallel Axiom cannot 
be proved from the others “by logic alone”, because then it would be true in 
every interpretation which makes the other axioms true, and we have found 
one where it is false. 

To define a model for an axiomatic theory, in general, one needs to specify 
a domain of objects and interpret on it the primitives of the theory, so that the 
axioms are true. For a theory about sets, this means we must define sethood 
and membership on some domain, and we must also identify which conditions 
and operations on the domain will be considered definite. Now, ZDC is a 
very strong theory, and models for it do not come cheap; we will study some 
very special ones (“set universes”) in Appendix B, but the most interesting 
constructions require delicate methods from mathematical logic which are 
outside the scope of these Notes. Here we will just state and discuss some of 
the many famous consistency and independence results of the subject as they 
become relevant in what follows. 

We have assumed at the outset, in 3.6, that our theory has a model, the 
standard universe of objects W, in which axioms (I) - (VI) (at least) are true. 
This assumption is natural and even necessary if our lives as set theorists are to 
have any meaning, but it is not included among the axioms of ZDC or any of 
the stronger theories we will consider. 21 It is almost never needed, except when 
we assert the existence of models of extensions of ZDC: to construct those, we 
have to start with something, and that is always the assumed, standard model 
of our theory. 

8.23. Proviso for model existence assertions. Without further mention, all 
claims in these Notes of existence of models, consistency of theories and 
independence of propositions are based on the existence of a model which 
satisfies axioms (I) - (VI) and (VIII), the Axiom of Replacement, which we 
will introduce in Chapter 11. 

8.24. The consistency of the Axiom of Choice (Godel, 1939). Zermelo’s theory 
ZDC + AC with the full Axiom of Choice has a model, and hence (I) - (VI) do 
not refute AC, or AC is consistent with (I) - (VI). Godel’s famous model L 
of constructible sets has many more canonical properties and it witnesses the 
consistency of AC with theories much stronger than (I) - (VI). We will come 
back to it on several occasions. 


21 In fact it is not possible to assume such an axiom: adding the existence of a model of ZDC 
to the axioms of ZDC creates a new and stronger theory ZDC' and the further problem whether 
ZDC' has a model. In the most famous result of Mathematical Logic, Godel proved (rigorously) 
that this conundrum cannot be avoided: there exists no axiomatic theory (consistent and worth 
studying) which includes among its axioms or theorems the assertion that it possesses a model. 
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8.25. The independence of the Axiom of Choice (Fraenkel-Mostowski, 1939, 
Cohen, 1963). Each of the theories 

(I) - (VI) + -AC N , (I) - (VI) + AC N + -DC, ZDC+ -AC 

has a model. This means that we cannot prove AC N from the constructive 
axioms (I) - (VI), we cannot prove DC from the constructive axioms and 
ACn, and we cannot prove AC in ZDC: each of these three choice principles is 
stronger than the preceding ones. The early model constructions of Fraenkel 
and Mostowski either contained atoms or had some other, technical defects 
which limited the possibility of generalizing them. Cohen constructed his 
models by his famous forcing method, which he (and others) also used to 
establish many more unprovability results. We will refer to it several times in 
the remainder of these Notes. 


Problems for Chapter 8 

Let us call two propositions f and i// constructively equivalent if their equiv- 
alence <j> <£=> y/ can be established on the basis of the constructive axioms 
(I) - (VI), i.e., without appealing to any choice principle whatsoever. 

x8.1. Prove the Axiom of Choice (8-1) for finite A. 

x8.2. The Axiom of Choice is constructively equivalent with the following 
proposition: for every i / I and every / : A — > B. there exists some 
g : B — > A such that for all x G A, f {g{f (x))) = / (x). 

x8.3. The Axiom of Choice is constructively equivalent with the following 
proposition: for each I and each indexed family of sets (i i— > A,-) on 7, 

(Vi G I)[Aj f 0]=>n jeI Ai 0- 

The Countable Principle of Choice is constructively equivalent with the propo- 
sition: for every sequence of sets (n i— > A„), n G N, 

(V/2 G N)[A„ f 0] => n»eN^« 7^ 0- 

Combined with Problem x5.28, the next problem gives a formulation of 
the Countable Principle of Choice ACn directly in terms of the membership 
relation, with no reference to N or the concept of “function”. 

x8.4. The Countable Principle of Choice ACn is constructively equivalent with 
the following proposition: every countable, infinite family IS of non-empty 
and pairwise disjoint sets admits a choice set. 

In the next four problems we establish that the Axiom of Dependent Choices 
is equivalent with several seemingly weaker principles of choice. 
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*x8.5. The Axiom of Dependent Choices is constructively equivalent with the 
following proposition: for every non-empty A and every relation P C A x A, 

if (Vx £ A) (By £ A)P(x,y), 

then (3 B C A)[B ± 0&(3/ : B -> B)(\/x £ B)P(x, f(x))]. 

*x8.6. The Axiom of Dependent Choices is constructively equivalent with the 
following proposition: for every relation P C A x A and a £ A, 

if (Vx £ A)(By £ A)P(x,y), 

then (3 B C A)[a £ B &(3f : B — > B)(\/x £ B)P(x, f(x))]. 

*x8.7. The Axiom of Dependent Choices is constructively equivalent with the 
following proposition: a poset P is grounded if and only if it has no infinite, 
descending chains, i.e., if for every / : N — > P, 

(V« £ N)[/(« + 1) < /(«)] => (3 n)[f(n + 1) = /(«)]. 

It is also possible to formulate the Axiom of Dependent Choices directly in 
terms of the membership relation, but not in a very pretty manner. 

*x8.8. Prove that the following proposition is constructively equivalent with 
the Axiom of Dependent Choices: for every set A and every binary definite 
condition P, 


if 


0 £ A& (Vw, v £ A)[P(u, v ) => (3!x)[v = u U {x}]] 

&(Vm € A)(3v £ A)P(u,v ) , 
then (35 C A)[0 £B&(\/u£ B)(3\v £ B)P(u,v)]. 


*x8.9. Prove that the following, lexicographic ordering on (N — > Mj is indeed 
a linear ordering but not a wellordering: 

f <g <=>df / = gV (3/i £ N)[(V/ < «)[/(/) = g(i)]&f(n) < g(n)]. 
Infer that ^(N) admits a linear ordering. 

x8.10. Grounded induction. For every grounded graph G and each unary 
definite condition P, 

if (Vx £ G)[(Vy)(x — y => P(y)) => P(x)], then (Vx £ G)P(x). 
*x8.11. For every grounded graph G and every function 

h : (G — E) x G -► E, 

there exists a unique (total) function / : G — > E which satisfies the identity 
fix) = h(f \{y £ G \ x -> G y}, x) (x £ G). 

Hint. Rework the proof of Theorem 7.24, using functions 
o t : {x £ G | t =>c x} — > E 

defined on “initial segments” of the transitive closure => G of — >g- 
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We will begin this chapter with a few results about countability whose proofs 
illustrate the difference between ACm. DC and AC, but our main task is to 
establish some important consequences of the full Axiom of Choice, includ- 
ing the basic laws of cardinal arithmetic. The telltale mark (AC) will grace 
practically all the numbered propositions. 

9.1. Theorem. Every infinite set has a countable , infinite subset, and so for every 
cardinal number n, either k < c Ho or Ho < c k. 

Proof. If A is infinite, obviously 

(Vw £ A*)(3y £ A)(Vi < lh(u))[u(i) ^ y]. 

It follows from DC that there exists a sequence / : N — > A such that 
(Vn)(Vi < n)[f(i) ± /(«)], 


and the image /[N] is a countable, infinite subset of A. The second assertion 
is trivial, taking cases on whether k is finite or infinite. H 

The point of the second assertion of the theorem is that while the general 
property of Cardinal Comparability requires the full Axiom of Choice, the 
special (and significant) case of comparability with H 0 is a theorem of ZDC. 
In fact, it is possible to prove Theorem 9.1 using the Countable Principle 
of Choice ACn instead of DC, but the proof is somewhat more technical, 
cf. Problem x9.1. This is a general fact about the relation between DC and 
ACm : many results whose natural proofs call for DC follow from the weaker 
principle, with some additional effort. 

Theorem 9.1 also settles the relation between infinite and Dedekind-infinite 
sets. 

9.2. Corollary. A set A is finite if and only if it is Dedekind-finite by 4.31, i.e., if 
there exists no injection n : A >-* B C A from A into one of its proper subsets. 

Proof. Finite sets are Dedekind-finite by the Pigeonhole Principle. 5.27. 
If A is infinite, let / : N >— ► A enumerate without repetitions some infinite, 
countable subset of it. The injection 


n{x) 


f{n + 1) if for some n. x = f{n), 
x if x f f[N] 
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Figure 9.1. Non-empty tree. 

witnesses that A is Dedekind-infinite, since n[A\ = A \ {/ (0)}. H 

Next we consider an elementary but very useful result about trees, whose 
proof offers an additional illustration of the use of DC and its relation to AC N . 

9.3. Definition. A tree 22 on a set E is any set T C E of strings from E which 
is closed under the relation of initial segment, 

«c«er => u g t. 

By (5-16), for strings, u C v u C v. 

A lot of terms are used in the study of trees, most of them deriving from our 
picturing trees as, well, trees. The members of T are its nodes or finite branches, 
and every non-empty tree has 0 as its least node, the root. If u * (x) G T, then 
u is a parent of u * (x) and u * (x) is a child of u in T. Each node other than 
the root has exactly one parent, but may have many children; if it has none, it 
is a terminal node or leaf. With each node u we associate the subtree 

T u =df {w G T | w C u V u C w} (9-1) 

of nodes comparable with u. Easily, 

T u = {w | w C u} U U {T v | v is a child of u}. (9-2) 

9.4. Exercise. Show (9-2). 

The infinite branches of a tree are its infinite sequences, and we collect them 
in the body of T , 

m =df {/ : N E I (V«)[J (n) G T]}. (9-3) 

Every infinite branch of a tree involves an infinite number of distinct nodes, 
so finite trees have empty bodies. It is also easy to construct infinite trees with 
empty bodies: 

9.5. Exercise. Show that the tree 

T = {u G N* | (Vi < lh( M ), i > 0 )[u(i - 1) > »(/)]} 
on the natural numbers is infinite but has no infinite branch. 


"Trees occur in many branches of mathematics, differently defined depending on the special 
needs of the field. The present definition is the most general we will need in these Notes. 



Chapter 9. Choice’s consequences 


123 


9.6. Definition. A tree T is finitely branching if every node of T has at most 
finitely many children. 

Notice that the tree in Exercise 9.5 is not finitely branching (at the root), 
and it could not be, by the following, basic result. 

9.7. Konig’s Lemma. Every infinite, finitely branching tree has at least one 
infinite branch. 

Proof. Suppose T C E* is infinite, finitely branching, and let 
S =df {u G T | T u is infinite}. 

This is the subtree of those nodes in T which are comparable with infinitely 
many nodes. Since T 0 = T is infinite by hypothesis, the root 0 G S, and (9-2) 
implies that S has no leaves, 

(Vm G S')(3.v G T)[u*{x} G 5]. 

because each u has at most finitely many children and the infinite set S u cannot 
be a finite union of finite sets. By the strong version of DC in Proposition 8.14, 
there exists some / : N — > E such that for every n. f ( n ) G S and fin + 1) is 
a child of / («), so that / is an infinite branch of S — and hence of T. 3 
Kdnig’s Lemma is very useful, especially in the following, more “construc- 
tive” version. 

9.8. Definition. A set of nodes B C T is a bar in a tree T . if every infinite 
branch of T passes through at least one node of B, 

(V/ G [T])(3n)[7(n) G B]. 

9.9. Fan Theorem. If T is a finitely branching tree and B is a bar for T . then 
there exists a finite subset 

Bq = {u\ u„} C B 

which is also a bar of T . 

Proof. Let Bq comprise the minimal members of the bar B, 

Bo =df {u G B | (Vu ^u)[v £ 5]}. 

and notice that B 0 is also a bar. because if / G [T] and n is least such that 
f(n) G B, then /(«) G B 0 . Let S be the tree of all initial segments of the 
nodes in B 0 . 

S =df {v £ T | (3m g -6o)[u E m]}. 

Now S is a finitely branching tree (a subtree of T), and its leaves are precisely 
the nodes in B 0 , because no member of B 0 is a proper initial segment of 
another. Thus, S cannot have an infinite branch, since B 0 is a bar for T. By 
Konig’s lemma then, S is finite, and so its subset Bo is also finite. 3 

The surprisingly simple proof of Konig’s Lemma is typical of arguments 
from DC. partly because its basic structure calls for DC. but also because of 
the following two reasons: 
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(1) Kdnig’s Lemma can be proved for every tree T on a well orderable set E 
with no use of choice principles, Problem x9.3. In many applications, E = N 
or E is finite, and then we need no choice hypotheses whatsoever. 

(2) Like 9.1, Konig’s Lemma can be proved by appealing to ACu rather 
than DC, Problem x9.4. 

Many of the applications of the full Axiom of Choice have the following 
form: first we state and prove in ZDC (or even with no choice at all) some 
interesting proposition about well orderable sets, and then we infer the result 
we want for all sets by appealing to the Wellordering Theorem. Typical is the 
following generalization of the Hypothesis of Cardinal Comparability where 
(for the first and last time) we will state separately the corollary about all sets. 

9.10. Theorem. Well founded ness of < ( . (1) For every non-empty class % of well 
orderable sets, there exists some Aq € IS such that for every A e IS , Aq < c A. 

(2) (AC) Every non-empty class 'S of sets has a < c -least member. 

Proof. By 7.33, let Go = (Aq, < 0 ) be a < 0 -least well ordered set with field 
in IS . If A e g 3 , then there exists some wellordering < of A and by the choice 
of Co, (A o, < 0 ) < 0 (A, <), so that, in particular, A 0 < c A since every initial 
similarity is an injection. H 

9.11. Lemma. The next cardinal. For every well orderable cardinal number k, 
the cardinal 

« + =df|x(«OI (9-4) 

is also well orderable and it is least among the well orderable cardinals bigger 
than n, i.e., 

k < c k + , n < c X=>n + < c X, (9-5) 

for every well orderable cardinal X. Here %{k) is the Hartogs well ordered set 
of n, defined in 7.34. 

Proof. Since k + is well orderable, it is comparable with k, and it cannot be 
< c k by Hartogs’ Theorem 7.34, so k < c k + . The minimality of /(k) implies 
the rest. H 

We set 

Hi = d f Ko + . H 2 = df H+,... . (9-6) 

9.12. Exercise. (AC) Since ( with AC) every two cardinal numbers are compara- 
ble, the Continuum Hypothesis CH and the Generalized Continuum Hypothesis 
GCH can cdso be expressed by simple equations of cardinal arithmetic : 

CH 2 H °= c H!, GCH (Vre > c H 0 )[2 K = c k + ]. (9-7) 

This, unfortunately, does not help their resolution. 

The next Lemma is often useful in arguments about well orderable sets. 

9.13. Definition. A best wellordering of a set A is any wellordering < of A in 
which every initial segment is smaller in cardinality than A, 

(Vx € A)[seg(.v) < c A], 
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9.14. Lemma. (1) Every well orderable set admits a best wellordering. 

(2) If <a, <b are best wellorderings of A and B. then 

if A = c B. then (A,< a ) = 0 (B. < b ). 

In particular, any two best wellorderings of the same set are similar. 

Proof. (1) Let U = (A,<) be < 0 -least in the class of all well ordered sets 
with field A, and suppose (towards a contradiction) that there exists some 
x € A, A < c segu(x). This yields an injection n : A >-> segc'(x) and the 
relation 

u <' v ^=>df tt(m) < n{v) ( u , v € A) 

is evidently a wellordering of A which is < 0 seg v (x) by 7.32. hence < 0 U, 
contrary to the choice of U. (2) Suppose U = (A,< a ), V = (B,<b) and 
(towards a contradiction) U = 0 segj/(x j for some x G B. The similarity 
7i : A > — » seg^(x) witnesses that A = c segy(x) < c B. which is contrary to 
the hypothesis A = c B. H 

Every best wellordering of a countable, infinite set is similar with the natural 
wellordering of N, and we can use best wellorderings to show that many 
properties of countable sets hold for all well orderable sets. Typical is the next 
result, which generalizes the identity H 0 2 = c H 0 and shows that the transfinite 
arithmetic of binary addition and multiplication is trivial. 

9.15. Lemma. For every infinite, well orderable set C , C x C — c C. 

Proof. Assume the contrary towards a contradiction, let C be a < ( -least 
counterexample by 9.10, and let < be a best wellordering of C. By the choice 
of C , for every infinite point x G C, 

|seg(x)| + |seg(x)| = c 2 • |seg(x)| 

< c |seg(x)| • |seg(x)| = c seg(x) < c C. (9-8) 

The key step in the proof is the following definition of a new wellordering 
of the product C x C, due to Godel, which we have already met (somewhat 
concealed) in the proof of 5.32. We set 

< g (x 2 ,y 2 ) ^=bif [max(xi,ji) < max(x 2 , yi)] (9-9) 

V[max(xi, yi) = max(x 2 , yf ) &xi < X 2 ] 
V[max(xi, yi) = max{x 2 , yf) &. x\ = X 2 

& v 1 < yil 

The maxima here are obviously computed relative to the ordering <. 
Sublemma. The relation < g is a wellordering of C x C. 

Proof. It is easy to verify that < g is a linear ordering. To show that it is 
a well ordering, suppose X C C x C is not empty: let w* be < -least such 
that for some (x, y) € X. max(x, y) = w *; next let x* be <-least such that 
for some y, (x*,y) G X and max(x*,y) = w*; and finally let y* be <-least 
such that (x*,y*) G X, max(x*,y*) = w*. It follows that (x*,y*) is the least 
member of X. H (Sublemma) 
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(a, b) {b.b) 

• • 


seg(6) x seg{b) 


• (a, a) 



{a, b ) 


Figure 9.2. Initial segments of the Godel wellordering. 

The well ordered sets (C, <) and (C x C, < g ) are < 0 -comparable and by 
the choice of (C, <), (C x C, < g ) < 0 (C, <) is not possible, so we must have 
C < 0 C x C : thus there exists some pair (a, b ) of members of C such that 

(C,<) = 0 seg cxc((a,b)) = seg g ((a,b)), 

and we will reach the desired contradiction if we can show that the initial 
segment seg^ ((a, b)) < c C. We consider the possibilities arising from the 
relative positions of a and b in <, and we use the fact that the point max(«. b) 
must be infinite. 

Case 1, a = b. From the definition of the Godel wellordering, 

(u,v) < g (a.a) [u < a &v < ci\ V [u < a &v = a] V [u = a &v < a], 

so that 

seg g ((a, a)) = (seg (a) x seg(a)) U (seg (a) x {a}) U ({a} x seg(a)), 

and by repeated applications of (9-8), 

|seg g ((a, «)) | < c |seg(u)| 2 + |seg(a)| • 2 < c |seg(a)| • 3 < c C. 

Case 2, a < b. Nowmax(u,b) = b, 

(: u , v) < g ( a , b ) [u < b & v < b] V [u < a & v = b], 

so seg((a, b )) = (seg(b) x seg (b)) U (seg (a) x {b}) and a similar computation 
shows again that |seg i ,((u, b))| < e C. 

Case 3, a > b. This time 

(u,v) < g (a,b) <==>■ [u < a &v < a] V [u < a &v = a] V [u = a &v < b], 

from which we reach a contradiction as in the preceding cases. H 

9.16. Theorem (The absorption laws). If k and X are non-zero , well orderable 
cardinals and at least one of them is infinite , then 

k + X = c k ■ X = c max(ft, X). 
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Proof. Assuming that 0 < c k < c A and using the result X • a = c X from the 
Lemma, we compute 

X k T X ^ c k * X X * X ~c X. H 

9.17. Corollary. (AC) Lor every indexed family of sets ( i i— > and every 

infinite k, if\I\ < c k and for each i £ I , k,- < c k, then Y2iei K > — c K - 

Proof. Using AC and the hypothesis, choose for each i £ / some injection 
7 ij : Kj >— ► k, so that the mapping ((/, x) i— > (i,7ti(x))) is an injection of 
{(/, x) | i £ I &x £ k,} into I x k. Thus 

J2iei K < = r {(h^) | i e I &x e «/} < c |/ x k| = c j/| • |/c| = c k. H 

To find interesting problems and results in cardinal arithmetic we must 
consider operations with infinitely many arguments, of which the simplest are 
the following. 

9.18. Cardinal Minimum Lemma. There is a definite operation inf (S’), such 
that for each non-empty family g 3 of well orderable sets, the value n = inf c (lf) 
has the following properties'. 

(1) k is a well orderable cardinal number. 

(2) For some A € g\ k = c A. 

(3) For every B £ <g , k < c B. 

In addition, these conditions determine the value inf c ) up to = c , i.e., if k is 
any set which satisfies (1) - (3), then k = c inf c ) . 

Proof. If the cardinal assignment \X\ is strong by 4.21. then 9.10 implies 
that there exists exactly one cardinal number which satisfies the condition 

Least {%,k) ^ (BA £ g)[(fiB £ W)[A < c B] &k = \A\], 
and we can set 

ini c (S’) = min(g’) = the unique n such that Least(g\ k). 

We need to do more, since we have only assumed that \X\ is a weak car- 
dinal assignment and there may well exist many values of k which satisfy 
Least(lf , k). 

By the Lemma in the proof of Hartogs’ Theorem 7.34, if A C |J and 
< is a wellordering of A. then the well ordered set U = (A, <) is similar 
with some proper initial segment of W = /QJ?’), and hence every A £ <g is 
equinumerous with some proper initial segment of W . Thus we can set 

w =df the least x £ W such that (BA £ <T)[A = c seg W '(x)]. 

inf e(f’) =df |seg,. F (w)l- 

Verification of the required properties of inf f . (T ) is quite easy. H 

9.19. Exercise. Show the part of the theorem which follows the “in addition” . 

9.20. Cardinal Supremum Lemma. There is a definite operation sup f (W), such 
that for every non-empty family % of well orderable sets, the value k = sup f (W ) 
has the following properties'. 
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(1) k is a well orderable cardinal. 

(2) For every A £ ? , A < c k. 

(3) If B is well orderable and for all A G If, A < c B, then k < c B. 

In addition, these conditions determine the value sup c .(lf) up to = c , i..e. if u is 
any set which satisfies (1) - (3), then k = c sup c (lf). 

Proof. Let C = /i(ljlf) be the Hartogs set for the union of If, which by 
Hartogs’ Theorem 7.34 is well orderable and greater in cardinality than every 
well orderable subset of |J<?, including every def. We set 

sup c (gr) = df inf c ({f? CC|(Vde W)[A < c 5]}) 

and verify easily the conclusion of the Lemma. H 

Infinite sums and products were defined in 4.21. We cannot say much 
about these, because infinite sums are as trivial as the finite ones (Problem 
x9.15), and infinite products are as complex as the Generalized Continuum 
Hypothesis, because 

2 K =< n, SK 2. 

There is, however, a very interesting inequality relating the two. 

9.21. Konig’s Theorem. (AC) For any two families of sets ( i i— > A,) and 
(i i— > Bj) on the same index set I fill), 

if (Vi G I)[Ai < : B,-]. then (J ieI Ai < c Tliei B i- (9-10) 

In particular, for families of cardinals, ( i i— > «,-) and ( i i— > f), 

if (Vi G /)[«,■ < c X ,], then << (9-1 1) 

Proof. The hypothesis and AC yield for each i an injection n, : Aj >-* By, 
and since t ij cannot be a bijection. there also exists a function c : I — > (J ieI Bj 
such that for each i, c(i) G B, \ We set 

fnfx), if x G Aj, 

/u,) = \c(0, if-v f A,. 
g{x) = (i !-»■ f{x, i)). 

If x fi y and x, y belong to the same A, for some i, then 

g{x){i) = 7ij (x) fi nfiy) = g(y)(i), 

because is an injection, and hence g(x) fi g(y). If no A t contains both x 
and y, suppose x G A t , y £ A,-; it follows that g(x)(i) = n,(.x) G nj[Aj] and 
g(y)(i) = c{i) G Bj \ m[Ai] so that again g(x) fi g(y). We conclude that the 
mapping g : (J ieI A t >-> H/g/ ' s an injection, and hence 

U ie/^i < c 

Suppose, towards a contradiction that there existed a correspondence 
h '■ U iei A i ^ II iei B i> 
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so that these two sets are equinumerous. For every i, the function 

h t (x) =df h(x)(i) (x G A,) 

is (easily) a function of A t into B, and by the hypothesis it cannot be a 
surjection; hence by AC there exists a function e which selects in each B, some 
element not in the image, i.e., 

e(0 G Bj \ hj[Ai\, (z G /). 

By its definition, e G \\,r r i Bj. so there must exist some x G Aj, for some j, 
such that h(x) = e; this yields 

s(j) = h(x)(j) = hj(x) G hj[Aj], 

contrary to the characteristic property of e. 

The cardinal version (9-11) follows by applying (9-10) to A t = {/'} x k ; - and 

b, = h 

9.22. Exercise. (AC) Konig's Theorem applies to the case I = k, A ,• = { i } and 
Bj — 2 and yields 

K = U El ,e K 2 = c 2 k , 

i.e., the theorem of Cantor. 

Despite its simplicity, Konig’s Theorem implies immediately a non-trivial 
inequality about the cardinal number c of the continuum which goes beyond 
Cantor’s K 0 < c c. It is most naturally expressed using cofinalities. 

9.23. Definition. The cofinality of a well orderable, infinite cardinal number 
k is the least well orderable cardinal X such that k is the union of A- many sets 
with cardinality smaller than k. Precisely; 

cf(re) =df inf f ({/ C k | for some indexed family (z i — > A,), e /, 

(Vz G I)[Kj < c k]&k = U/e i K i})- 

Notice that the family of well orderable index sets whose inf. we take is not 
empty; it contains k, since 

« = (9-12) 

The general properties of inf imply the following basic properties of the 
cofinality operation; 

(1) Cf(/t) < c K . 

(2) k = 1J iec f( Kj Kj for some indexed family (z i-> K t ) such that 
(Vz G cf(«))[Ai < c k], 

(3) If X is well orderable, (Vz G A)[L, < c k] and k = |J 
then cf(/«) < c X. 

Moreover, these conditions characterize cf(/e) up to = c . 

A well orderable, infinite cardinal k is regular if cf(zt) = c k, otherwise it 
is singular. It is convenient to define the operation cf(/c) and the regularity 
condition for infinite, well orderable k without assuming the full Axiom of 
Choice, but most results about these notions require AC. 



130 


Notes on set theory 


9.24. Exercise. No is regular , because every finite sum of finite cardinals is finite. 

9.25. Corollary. (AC) For each infinite cardinal number k, 

cf(2 K ) > c K, 

and in particular, cf(c) > c No, i.e., the continuum c is not the union of countably 
many sets of cardinality < c c. 

Proof. By Konig’s Theorem, if K, < c 2 K for every i G 2 with 2 < c k, then 

U i^K, < c n i6/l 2 K = c (2 K ) /l = c = c If 

which contradicts cf(2 K ) < c k. H 

9.26. Godel’s model L of the constructible sets satisfies the Generalized Con- 
tinuum Hypothesis, so for each k, 2 k = c k + is regular by Problem x9.19. 
Using Cohen’s forcing method, it is possible to construct models of Zermelo’s 
theory ZDC + AC in which c is singular, with cofinality cf(c) some regular 
cardinal between N 0 and c, for example Ni. 

We have left the basic properties of cofinalities for the problems, as they 
are very simple. We should remark, however, that it is not possible to study 
the topic seriously now, because without the Axiom of Replacement, it is not 
even possible to prove that singular cardinals exist ! 


Problems for Chapter 9 

*x9.1. Show Theorem 9.1 using only the constructive axioms (I) - (VI) and 
the Countable Principle of Choice ACn. 

x9.2. Consider a system of airline routes which connects the (possibly infin- 
itely many) cities of some world and assume the following. (1) From each 
city, there are only finitely many cities to which one can fly non-stop. (2) It 
is possible to travel by air from every city to every other city. (3) It is not 
possible to keep flying forever without visiting the same Airport twice. Show 
that this world has only finitely many cities. 

x9.3. Show Konig’s Lemma 9.7 for the case where T is a tree on a well 
orderable set E, with no appeal to choice principles. 

*x9.4. Show Konig’s Lemma 9.7 using only the constructive axioms (I) - (VI) 
and the Countable Principle of Choice ACn. 

x9.5. Suppose T is a finitely splitting tree and B is a bar for T. Show that 
there exists some integer k, such that for all infinite branches / G [T], some 
/(/) G B , with i < k. 

*x9.6. Suppose C is a well orderable set, / :CxC— >C,ACC, and let 
A f = df f|{V C C | A C X&f[X x X] C X} 
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be the closure of A under /. Define the sets { A n }„ eN by the recursion 

Aq = A , A n+ i = A n U f \A n x A n ], 

and show that 

A-f = U neN^«' 

Show also that if A is infinite, then A f = c A. 

x9.7. Show that if C is well orderable, then so is the set C * of all words (finite 
sequences) from C. 

x9.8. If you used ACn or DC in Problems x9.6 and x9.7, do them again, using 
no choice principles at all. 

x9.9. Every Hartogs set h{A) is best wellordered by < /(4) , 

x9.10. If {n * ^ and (ti i ^ arc sequences of cardinal numbers, 

and for every n , K n < c X n , then 

X)«eN K » — c rif!eN K « — c n«eN^«- 

x9.11. (AC) If (z i-> «j)/ e j and (z i-> A,), e / are families of cardinal numbers 
on the same index set / and for all i G I, k, < c 1, , then 

f n <e /«f — c FI 

x9.12. (AC) For every indexed family of sets (z A ( ) (£ /, 

iriie/^'l = c 

and the same for sums, with “disjoint union” on the left. 
x9.13. Explain the notation and prove the identity 

n,e/n ; e/(/) K u = c W{(i,j)\iei &jeJ(i)} Ki J ■ 

x9.14. (AC) Prove the characterization of sup, (*f) claimed in 9.20. 

x9.15. (AC) For every family of infinite cardinal numbers (z i— > k, ) on a 
non-empty index set I, 

=c max(|/|,sup c ({«,■ | i G /})). 
x9.16. Prove that for every infinite, well orderable cardinal k, 

cf(zc) = inf ( ({/ C k | for some indexed family (z <— > A/ ) ,■ £ / , 

(Vi, j G /)[/ ^ j =► A/ n K'j = 0] & (Vi G /)[*/ < c k] 
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*x9.17. (AC) Prove that for every infinite cardinal ac, 

cf(re) = inf c ({7 Cl ac | for some indexed fnmily of cnrdinnls (/ i — ^ a 

(V/ g /)[rc/ <c f - At] & at =£. y;,. c r At/ . 

Hint. It helps to use Problem x9.16. 

*x9.18. Show that for every infinite cardinal At, cf(cf(/t)) = c cf(At), and hence 
cf(At) is always a regular cardinal 

x9.19. (AC) For each infinite cardinal At, the next cardinal At + is regular. Hint. 
Problem x9.17 simplifies the proof. 

x9.20. (AC) Show that for each infinite cardinal At, k < c At cfW . 

*x9.21. (AC) Every partial ordering < on a set P has a linearization, i.e., some 
linear ordering <' of P exists such that x < y ==> x <’ y. 

The next problem gives the basic fact which relates inductive and directed- 
complete posets. Notice that by “chain” in a poset P we mean any subset 
C C P which is linearly ordered by the poset partial ordering < P \ C is a well 
ordered chain if in addition, the restriction of </• to C is a wellordering. When 
we say that “S is well orderable” for some .S' C P. we mean that S admits 
some wellordering <, which may be (and typically is) totally unrelated with 
the given partial ordering < P of P. 

*x9.22. If every well ordered chain in a poset (P, <p) has a least upper bound, 
then for every well orderable, directed subset S of P there exists a well ordered 
chain C with the following two properties. 

(1) S is dominated by C, i.e., for each x G S there exists some y € C such 
that x < P y. 

(2) For each y e C, there exists a directed subset C y C S such that 
\C y \ < c |5| and y = sup C y . 

Notice that C may satisfy these conditions without being a subset of S. 
Hint (W. Allen). Towards a contradiction, let S a be well orderable, directed 
and < f -lcast counterexample to the conclusion, verify first that S must be 
uncountable, and let < be a best wellordering of S. Define the function 
/ : S x S S so that x, y € S => x,y< P f (x, y), and for every x € S , set 

C x =df segU)/, 

with the notation of Problem x9.6. Show that this is directed, that sup C x 
exists for each x € S, and that 

C = df {sup C x | x G S} 

is a well ordered chain in P which has properties (1) and (2) for S. 

*x9.23. (AC) The following three conditions are equivalent, for every poset P: 

1. Every directed set in P has a least upper bound. 

2. Every chain in P has a least upper bound. 
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3. Every well ordered chain in P has a least upper bound. 

In particular: (AC) A poset is inductive if and only if it is directed-complete, a 
dcpo. 

*x9.24. (AC) Show that a monotone mapping n : P — > Q on one inductive 
poset to another satisfies the identity 

^(supA) = sup7t[S] (9-13) 

for every non-empty chain S C P, if and only if it satisfies (9-13) for every 
non-empty directed S C P. 

x9.25. Prove that the characterization of continuity for mappings of the form 
7 1 : (A E) (B ^ M ) in x6.24 holds for all sets A, E, B, M. 

x9.26. (AC) Finite Basis Lemma. Let -f be a non-empty family of subsets of 
some set V, such that 

le / 4=4- (VT C X)[Y finite=4 Y € J"\. 

Show that A/ has a maximal member (under C). 

* x9.27. Let S be a family with the finite basis property as in x9.26 and assume 
in addition that V is well orderable; show (without AC) that has a maximal 
member. 

x9.28. (AC) If you know what vector spaces are and the basic facts about 
linear independence, prove that every vector space has a basis. Prove also 
without AC, that every well orderable vector space has a basis. Hint. Apply 
x9.26 or x9.27 to the family of all linearly independent subsets of the given 
space. 

*x9.29. (AC) If you know something about fields and algebraic extensions, 
prove that every field has an algebraic closure. Hint. The usual argument for 
this runs as follows. We consider the class 

sA = df {F | F is an algebraic extension of A} (9-14) 

partially ordered by 

F\ C F 2 4=4-df F\ is a subfield of F 2 , 

we notice that it is an inductive poset, so that it has a maximal element K, 
and we verify that this K is algebraically closed. The argument is defective, 
because the class sf in (9-14) is not a set. To correct it, in the interesting case 
where K is infinite, we need to notice that every algebraic extension of K is 
isomorphic with some field F = c K, so we can replace st in (9-14) by 

=df {F C E | F is an algebraic extension of A}. (9-15) 

where E is some superset of A with cardinality greater than A. 
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* x9.30. Prove that every well orderable field (and in particular, every countable 
field) has an algebraic closure. Hint. The idea is to avoid AC by using 
Transfinite Recursion to construct the closure explicitly. You still need the 
trick suggested in the previous problem. Do the countable case first, it clarifies 
which algebraic results are needed. 



CHAPTER 10 


BAIRE SPACE 


Next to the natural numbers, perhaps the most fundamental object of study 
of set theory is Baire space, 

Af =d f (N — > N), (10-1) 

the set of all number theoretic sequences. If we let 

C=df (N-»{0,1}) (10-2) 

be the Cantor set 23 of all infinite, binary sequences, then 

C C AT C p(N x N), 
and with now familiar computations, 

c = c 2 H ° = c \V(N)\ = c \C\ < c \Af\ < c |P(N x N)\ = e |P(N)| = c. 

Since Af = c R will follow as in Chapter 2 from the proper definitions in 
Appendix A, the Continuum Hypothesis 3.2 is equivalent to the proposition 

(CH) {MX C Af)[X <c N V X = c Af]. 

In fact, there is such a tight connection between Af, C and R that practically 
every interesting property of one of these spaces translates immediately to a 
related, interesting property of the others. In the problems we will make this 
precise for Af and C and in Appendix A for R, where we will also draw the 
consequences of the results of this chapter for the real numbers. 24 


The material in this chapter is not necessary for the comprehension of the two chapters which 
follow, and the exposition is more condensed and requires more effort from the reader than the 
rest of these Notes. In a first reading, it may be best to skip it and come back to it after Chapters 
1 1 and 12 have been mastered. 

23 It is traditional to use the same name for this subset of M and the set of real numbers defined 
in the proof of 2 . 14 . Figure 2.4 explains vividly the reason for this, and nobody has ever been 
confused by it. 

24 One may think of M as a “discrete", “digital”, or “combinatorial” version of the “con- 
tinuous” or “analog” R. A real number x is completely determined by a decimal expansion 

x(0) ,x(l)x(2) where (n i— > x(n)) S A/\ but two distinct decimal expansions may compute 

to the same real number. This is a big “but”, it is the key fact behind the so-called topological 
connectedness of the real line which is of interest in analysis, to be sure, but of little set theoretic 
consequence. We may view Baire space as a “digital version” of R because it does not make any 

such identifications, each point x S Af determines unambiguously its “digits” x(0).x(l), 
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(2,U) 


(1,2.3) 

(1,2,1) 


( 0 , 1 , 1 ) 

( 0 . 0 . 0 ) 


Figure 10.1. A small part of Baire space. 


Our aim here is to establish some elementary facts about A/ - which bear on 
the Continuum Problem. We will define the family of analytic subsets of A f 
and prove that every analytic set satisfies the Continuum Hypothesis, in the 
sense that it must be either countable or equinumerous with M. This Perfect 
Set Theorem 10.20 is significant because essentially every set of interest in 
classical analysis is analytic, including all the Borel sets which play a funda- 
mental role in measure theory and integration; by Suslin’s Theorem 10.31, 
a set A C M is Borel exactly when both A and J\f \ A are analytic. On the 
other hand, we will show in Theorem 10.32 that the basic and natural method 
of proving the Continuum Hypothesis for analytic sets cannot be extended 
to solve the full Continuum Problem, which remains open. In addition to 
their applications in analysis, Theorems 10.20 and 10.32 are of substantial 
foundational interest, as their proofs illustrate beautifully the role of choice 
principles in classical mathematics. 

10.1. The structure of A/\ Our intuitions about J\f come from picturing it as 
the body of the largest tree on N in the terminology of 9.3 and (9-3), 

JV=[N*]. 

We will refer to subsets of Baire space as pointsets, the term “point” temporar- 
ily reserved for members of A f, infinite branches in N*. By the complement of 
a pointset, we will mean its complement in Af, 

cA =df A/"\ A. (10-3) 

It is convenient to extend the initial segment notation on strings, 

u C x u C x (u € N*,x € A f). 


(10-4) 
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to indicate that a finite sequence u is an initial segment of the point x, an 
approximation of x which determines the first lh(w) values of x. For each 
u £ N* . the set 

N u =df {a g M | u E x} = [N* u ] (10-5) 

of points in A f which extend u is the neighborhood determined by u in A f. 

10.2. Exercise. For all u. v £ N*, 

u c v M v c M u . 

10.3. Exercise. The family of neighborhoods is countable. 

10.4. Definition. A pointset G is open if it is a union of neighborhoods, so 
that 

x £ G (3 u)[x £ M u & M u C G]: (10-6) 

closed if its complement is open; and clopen if it is both closed and open. 

Open sets are often defined by the following, easily equivalent condition; 

10.5. Exercise. A set G C M is open if and only if for every x £ G. there exists 
some neighborhood M„ such that x £ M u Q G. 

10.6. Proposition. (1) 0, M and cdl neighborhoods are clopen. 

(2) Every singleton {x} is closed but not open. 

(3) Every non-empty open pointset is the union of a sequence of neighborhoods. 

(4) The union (J & of a family 5? of open pointsets is open and , dually, 
the intersection f\SF of a family SF of closed pointsets is closed. (We assume 
f)0 = Af,so the intersection operation is defined for every family of pointsets.) 

(5) The intersection Gi D G 2 of two open pointsets is open, and dually, the 
union F\ U F 2 of two closed pointsets is closed. 

Proof. (1) Each neighborhood is open, since M u = (J {A/),}, and, in par- 
ticular, M = M( ) is open. Neighborhoods are also closed by Exercise 10.5; 
if x £ M u , then u % x, so there exists some i < lh(w) such that x(i) f u(i) 
and then x £ A/^o),... , x (t)) while A/( x (o),... , x (i)) n M u = 0- The empty set is the 
union of the empty (!) family of neighborhoods, formally 

0 = U{^« IA/LC0}. 

(2) A singleton {x} is not open, because it does not contain any neighbor- 
hood, and so it cannot be a union of neighborhoods. Its complement is open, 
however, since 

Af\ {x} = (J{A/L I » % x}. 

(3) If G is open and non-empty, then the set {« | M u C G) is non-empty 
and countable, and so it can be enumerated. 

(4) This is immediate from the characterization of open sets in Exercise 10.5. 

(5) If G\, 63 are open and x £ G\ D Gj, then there exist u,v Q x such that 
M u C G\ and M v C G 2 . The finite sequences u, v are comparable since they 
are both initial segments of x, so suppose u C v, the argument being the same 
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in the opposite case: now D A f v , so M v C G\ fl Gi, as required. The dual 

property for closed sets follows by taking complements. 3 

Baire space is a topological space by the classical definition recounted in 
4.30, but a very special one, because of the next, basic connection between its 
topology and the combinatorial structure of the tree N*. In proving it — and 
in the sequel, routinely — we will use the following trivial equivalence relating 
a tree T and its body: 

x G [T] ^ (Vw C x)[u G T], (10-7) 

It follows immediately from the definition of [T], (9-3). 

10.7. Proposition. A pointset F is closed if and only if it is the body of a tree T 
o«N, F = [T], 

Proof. If x ^ [ T\, then for some u C x, u ^ T, and then M u fl [ T ] = 0, 
so M u C c[7’]; thus c[T ] is open and [T] is closed. Conversely, if we associate 
with each pointset F the tree 

T f = df {u G N* | (3x G F)[u C x]}, (10-8) 

then obviously 

F C [T f ], 

If F is closed, we also have [T F ] C F: because if x ^ F, then for some 
u C x, M„ D F = 0 by the openness of the complement cF , hence u ^ T F 
and x $ [T f ] by (10-7). H 

This basic characterization allows us to classify closed pointsets by the 
combinatorial properties of the trees which define them. It is not wrong to 
think of the cluster of combinatorial notions to come as the combinatorial 
geometry of M, although it is not a “geometry” by any standard, classical 
definition of this term. 

10.8. Definition. Set 

u | v m, v are incompatible (10-9) 

(3/ < lh(w), lh(w))[w(?) f u(/)], 

and, by extension, 

U I X ^=4>df ~{u C x] (3u C x)[u | v]. 

A string u splits in a tree T if it has incompatible extensions in T and a tree T 
is splitting if every iigT splits in T . 

u G T => (3wi, «2 G T)[u C u\ & u C « 2 & hi | W 2 ]. 

Notice that a splitting tree has no terminal nodes. 

A pointset P is perfect if it is the body of a splitting tree. Perfect sets are 
automatically closed. 

10.9. Exercise. Each neighborhood M u is perfect. 

10.10. Proposition. Every non-empty, perfect pointset P has cardinality c. 
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Proof. Suppose P = [T] with T non-empty, splitting, and choose func- 
tions 

/ : T -> T. r : T -> T 

which witness the splitting property for T , i.e., for each u G T , 
uQl(u), uQr(u), l(u)\r(u). 

By the String Recursion Theorem 5.33, there is a function 

a : {0,1}* -> T 

from the tree of all binary strings into T which satisfies the identities 

cr(0) = 0, ct(m*(0)) = l(a(u)), <t(m*(1)) = r(a(u)). 

Thus g(u *(/)) is a proper extension of o{u) for i = 0,1, so a is strictly 
monotone, 

u v ==> a(u) ft c(u), 
and we can define a function n : C — > [T] by 

7t(x) =df sup {o(u) | u C x}. (10-10) 

The key property of a is that it also preserves incompatibility, 

u\v=>o(u)\a(v). (10-11) 

To see this, let i be least such that u{i) ^ v{i), so for some w we have 

w *(0) Qu, w *( 1) C v 

(or the other way around); now < 7(10 *(0)) and cr(tu*(l)) are incompatible 
by their definition, and the monotonicity property implies that cr(«), a{v) 
extend them, so they are incompatible too. Finally, (10-11) implies that n is 
an injection, and this establishes that C < c [T], which is all we need. H 

This simple abstraction of Cantor’s proof of the uncountability of the reals 
(2.14) suggests an attack on the Continuum Problem; to prove that an un- 
countable pointset has cardinality c, it is enough to show that it contains a 
non-empty, perfect subset. This is trivially true of open sets (because each A f u 
is perfect) and it is also true of closed sets, less trivially. 

10.11. Cantor-Bendixson Theorem. Every closed subset F of M can be decom- 
posed uniquely into two disjoint subsets 

F = PUS, PCS' = 0, (10-12) 

where P, the kernel of F, is perfect and S, the scattered part of F, is countable. 

It follows that every uncountable, closed pointset has a non-empty, perfect 
kernel and hence has cardinality c. 

Proof. Let T = T F as in (10-8), so that T has no terminal nodes and 
F = [T], and set 

S = d f U {[?»] | w G T & |[T U ]| < c ^o}, 

P =df F \ S. 



140 


Notes on set theory 


By its definition, S is the union of countably many, countable sets, so it is 
countable (note the use of AC i here), and it remains to show that P is perfect. 
The set of strings 

kT = {uGT | |[7),]| > c Ho} 

is easily a tree, and 

x € S x € F & (3m C x)[u £ kT] 
is another way to read the definition of S. Since P = F \S, 

x G P x £ F & [x ^ F V (Vm C x)[u G kT]] 

(Vm C x)[u G T] & (Vm C x)[m G kT] 

(Vm C x)[m G kT] 

4 = => x G [A'7 1 ]. 

and it is enough to prove that k T is splitting. Suppose, towards a contradiction 
that some u G kT does not split. This means that all extensions of u in kT 
are compatible and they define a single point 

x = sup{u G kT | m Cv}. 

Since every extension of u in kT approximates x, 

[T u \ = {x} U U {[T„] | m C v G T & | [7V] | < c Ho}; 

this, however, implies that [T„] is a countable union of countable sets, which 
is absurd. 

We leave the uniqueness of the decomposition (10-12) for the problems, 

xl0.2. 3 

10.12. Definition. A family T of pointsets has property P if every uncountable 
set in T contains a non-empty, perfect subset. In this classical terminology, 25 
the family T of closed pointsets has property P, or (more simply) every dosed 
point set has property P. 

10.13. Exercise. If a family of pointsets F has property P, then the family T ff 
of all countable unions of sets in T also has property P. 

Thus every T a pointset, of the form 

d = (10-13) 


25 The classical terminology in question is quite absurd, but so well established that it would be 
folly to change it or bypass it. In any topological space, closed sets are JT'-sets, from the French 
fermet: open sets are G-sets, from the German Gebiete (it means region ): countable unions of 
T-sets are T^-sets, and countable intersections of T-sets are T^-sets. from the German words 
Summe and Durchschnit for union and intersection, respectively. We will only use this terminology 
in passing references to T a and Q s pointsets. 
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with each F„ closed has property P. The same is true of every Qg set. of the 
form 

^ = n„ eN G„ (10-14) 

with each G n open, but the proof is not that simple, and it is ultimately easier 
to establish directly the property P for the much larger family of analytic 
pointsets. 

10.14. Definition. Recall from 6.25 that a function / : X — > Y on one topo- 
logical space to another is continuous if the inverse image / _1 [G] of every 
open set in Y is open in X. A pointset ACJ\f is analytic or Suslin, if either 
A = 0 or A is the image of Baire space under a continuous function, in symbols, 

A =df {A C M | A = 0 V (3 continuous f : M —> M)[A = /[A/ - ]]}. 

Continuity in M has a simple, combinatorial interpretation which is the key 
to its applications. 

10.15. Theorem. A function f : M — > M is continuous if and only if there exists 
a monotone function x : N* — > N* on strings, such that 

f(x) = sup (t(m) I u E x} (x G M) (10-15) 

= lim„ x(x(n)). ( 

When t : N* — > N* is monotone and (10-15) holds, we say that x computes 
the function /. 

Proof. If / satisfies (10-15), then 

f{x) G M v <==A- (3m C A)[r | E t(m)], 

so each inverse image of a neighborhood 

f~ l Wv] = U{-^ I v E t(m)} 

is a union of neighborhoods and / is continuous. 

For the more difficult converse, suppose / is continuous and let 

S(u) = df {v G N* | f[M u \ C M v } (; u G N*). 

Each S{u) f 0, since the root 0 G S(u); v E v' G S(u) => v G S(u); and 

v, v' G S(u) => f[M u \ c M v njV„' =$■ [u E v' V v' E v]. 

since v \ v' => M v fl M v ' = 0. Thus, there are two possibilities: 

Case 1. There is some v G S(u) such that lh(u) = lh ( m). In this case we set 

x(u) =df v = the unique string in S(u) such that lh(u) = lh (u). 

Case 2. There is no v G S(u) such that lh(u) = lh(w). Now we set 

x{u) =df sup{u | v G S(u)}. 

The monotonicity of x follows easily from the implications 

Ml E u 2 =>f[M Ul ] 2 f[M Ul \=^ s{u\) c S(u 2 ), 
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considering the possibilities in the definitions of t(u\) and t(«2). To prove 
(10-15), notice first that because r(w) G S(u), 

Af => f{x) g Af z ( u )=>z(u) c /(x). 

Moreover, by the continuity of /, if v C / (x), then there is some u C x such 
that f[N u \ C A/),, hence u G S(w) and either immediately u C r(w), if t(k) is 
defined by Case 2, or there is some «' extending w, with lh(w') = lh(u) such 
that v = t(m') in the other case. H 

It is useful to think of (10-15) as a computational characterization of con- 
tinuity: the string function r (u) gives us better and better approximations 
z(u) C f(x) to the value of /, as we feed into it successively finer approxi- 
mations u C x to the argument. We can turn this picture into a precise and 
elegant result, in terms of the notions introduced in Chapter 6. 

10.16. Corollary. A function f : Af — > Af is continuous if and only if it is the 
restriction to Af of some monotone , continuous mapping 

n : (N — N) (N — N) 

on the inductive poset (N — >• N). By Definition 6.22, a monotone mapping 
7i : (N — ^ N) — > (N — > N) is continuous if it satisfies the equivalence 

7i(x)(i) = w •*=>■ (3u G N*)[« C x & n{u){i) = to]. 

Here we are using the fact that Af is a subset of (N — *• N) , consisting precisely 
of all its maximal elements, and the basic observation is the decomposition 

(N — - N) = N* UAf, N* n A/" = 0. (10-16) 

Proof. If / : N — » Af is continuous, let r compute it by the Theorem and 
take (literally) 

71 = T U /, 

i.e., 7 t(u) = t (m) for u G N* and 7 i(x) = f (x) for x G Af. The continuity of n 
is trivial. The converse is very easy. H 

10.17. Exercise. Prove the “ easy converse”, i.e., that if f : Af Af is the 
restriction to Af of some continuous n : (N — ^ ■ N) — > (N — ^ N), then f is 
continuous. 

The Corollary makes it possible to recognize continuity of specific functions 
on Baire space instantly, by inspection, simply noticing that every digit / (x) (i) 
of each value /(x) can be computed using only finitely many values of x. 
As in Chapter 6, a passing remark that some function or other is “evidently 
continuous” accompanied by no proof typically means an appeal to this result. 

10.18. Definition. A pointset K C Af is compact if K = [T] with a tree T on 
N which is finitely branching. In particular, every compact pointset is closed, 
and C is compact. 

Some cheating is involved in adopting this as the definition of compactness 
for pointsets, since there is a perfectly general definition of compactness for 
sets in arbitrary topological spaces, by which 10.18 is a theorem. Without 
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comment, we did the same for perfection, which is also a general, topological 
notion. What we need here are the combinatorial properties of these pointsets 
specific to Baire space and we have relegated their topological characteriza- 
tions to the problems, xl0.17 and xl0.21. 

10.19. Proposition. (1) The image f[K ] of a compact point set K by a continu- 
ous function f : M — > Jf is compact. 

(2) The image f[K ] of a compact and perfect pointset K by a continuous 
injection f : M >— > N is compact and perfect. 

Proof. (1) Suppose K = [ T ] where T is finitely branching and r computes 
/ as in (10-15), and let 

s = T fm = | ( 3x e k)[v C /(x)]} 

be the tree of initial segments of the image f[K\. It is enough to prove that S 
is a finitely branching tree and f[K] = [5]. 

To see first that S is finitely branching, suppose v G S. let 

B =df {u G T | v Q x(u) V v | r ( m)}, 

and suppose that x G [T]. If v|/(x), then for some n. v\ t(x(«)), so 
x{n) G B: and if v C /(x), then for some n. v C x(x(n)), and again 
x(n) G B. Thus, B is a bar for T . and by the Fan Theorem 9.9 it must have a 
finite subset 

B 0 = {m 0 , • • • , m„} C B 

which is also a bar. Thus, for every x G K such that v Q f (x) , there exists 
some ut such that v Qx (m,) C / (x), so that every child of v in S is an initial 
segment of some t(m, ), and there are only finitely many of those. 

Clearly, f[K] C [S], To prove [S] C f\K\. suppose towards a contradic- 
tion that y G [S'] \ f[K] and let 

B =df {u G T | t(m) | y}. 

Now B is a bar for T . because the only way that x(x(n)) can be compatible 
with y for every n is if / (x) = y. By the Fan Theorem again, there is a finite 
subset 

Bo = {m 0 . . . . , u n } C B 
which is also a bar for T . Let 

k = max{lh(r(M,)) | i < n} + 1, 

and choose some x G [T] such that y{k) C /(x), which exists because 
y G [5], so that y can be approximated arbitrarily well by points in the 
image of /. On the other hand, m, C x for some i since B» is a bar; hence 
x(ui) C /(x) because x computes /; so both x («,•) and y{k) are initial 
segments of /(x), and hence compatible; and since x (m,) has smaller length 
than y(k), this means that t(m ; ) C y, which contradicts the definition of B. 
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(2) With the same notation as in (1) and the additional hypothesis, let 
v G S, so that for some u G T, v E x (u). Since T is splitting, there exist 
distinct points 

x \,%2 G K n A f u , 

and since r computes /, 

t(m)E/Ui), t(w)C/(x 2 ). (10-17) 

But f(x 1 ) /(x 2 ), because / is an injection, so there exist incompatible 
v\ E fix 1 ), V 2 E fix 2 ), which extend r(w) by (10-17), so they split r(w) and 
hence the smaller v E t(m) in A. H 

10.20. Perfect Set Theorem (Suslin, 1916). Every uncountable, analytic set has 
a non-empty, perfect subset. 

Proof. Assume that A = /[N*] is uncountable, suppose x computes /, 
and let 

T = df {u G N* | \f[K]\ >cM- (10-18) 

Clearly, T is a non-empty tree. 

Lemma. The tree T is x-splitting, i.e., for each u G T, there exist u\ , 112 G T 
such that 

U E Ml, u E U 2 , t(hi)|t(w 2 ). 

Proof. For any h g T and any fixed x G A f u , 

W«] = {fix)} u U I t(m') 1 fix)} (10-19) 

since / (y) f f (x) => 7 ( 1 /') E / (>’) for some u' such that r («') is incompat- 
ible with /(x). If the Lemma fails at u, then 

m E u' G T E /(x); 

thus each image /[A/),'] with t(m') | / (x) in (10-19) involves some u' T and 
is countable, and there are only countably many choices for u' . Thus, f[Af u ] 
is the union of a singleton and a countable family of countable sets, hence 
countable, contrary to hypothesis. H (Lemma) 

As in 10.10. we choose functions 

/ : T -> T. r : T —> T 

which witness the r-splitting property for T , i.e.. for each u G T , 
u E /(«), u E fiu), t(/(m)) I z{r{u)), 
and we define by the String Recursion Theorem 5.33 a function 

a : {0,1}* — T 

from the tree of all binary strings into T which satisfies the identities 

a (0) = 0 , ct(m*(0 )) = l(cr(u)), ct(m*(1}) = r{a{u)). 
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and which, as a consequence, is monotone. The key property of this a is that 
it takes incompatible binary strings to t -incompatible strings, 

u\v=>z(o(u))\z{o(v)), (10-20) 

and it is verified exactly as (10-11) was verified in the proof of 10.10. In 
addition, a computes a continuous g : C — > Af, 

g(x) = sup{cr(«) | u C x}, 

and evidently 

g[C] C [T], (10-21) 

Consider now the composition h = fg of the given / and this g, which is 
computed by the composition of z and <r: 

h(x) = sup{t(ct(m)) | u C x } . 

This is continuous and injective by (10-20), so its image h[C] = f g[C] is 
compact and perfect by 10.19, and it is included in f[T] C A by (10-21). H 

The result means nothing, of course, until we prove that there are lots and 
lots of analytic sets. 

10.21. Lemma. Every closed pointset is analytic. 

Proof. Let T = T F as in (10-8) for the given closed set F / 0. so in 
addition to F = [T] we also know that every string in T has an extension, i.e., 
there are no terminal nodes. Thus, we can fix a function / : T — > T such that 

u £ T =>■ u □/(«)& lh(/(w)) = lh(w) + 1. 

Let also 


rtail(w) =df u ([0. lh(«) - 1) (lh(ii) > 0) (10-22) 


be the partial function which strips each non-empty string of its last element. 
By the String Recursion Theorem 5.33, there is a function r : N* — * T such 


that 


t(w) 


u, if u G T. 

/ (r(rtail( «))), if u ^ T, 


which is (easily) a projection of N* onto T, i.e., it is total, length preserving 
and the identity on T , and which (as a consequence) computes a function 

H 


10.22. Lemma. Every continuous image of an analytic pointset is analytic. 

Proof. If A = f[B ] and B = g[A r ], then A = f g[N], and the composition 
fg is continuous. H 

10.23. Lemma. Iff, g : J\f —> Af are continuous functions, then the set 


E = {a | f{x) = g(x)} 


of points on which they agree is closed. 
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Proof. Because distinct points can be approximated by incompatible initial 
segments, 

X i E f{x) f g(x) 

(3m, v)[f(x) G M u & g(x) G Af v & u | t>], 

which means that 

cE = {J{f- l [Af u ]ng- l [Af v \ | u\v}, 

so that cE is the union of open sets and hence open. 3 

10.24. Theorem. Countable unions and countable intersections of analytic point sets 
are analytic. 

Proof. Suppose that A n = /„ [TV] with each /„ continuous and define first 
/ : Jf — > Jf by the formula 

fO) = /z(o) (tail(z)), 

where 

tail(z) =df (i z(i + 1)) = (z(l), z(2), . . . ) 

is the function which decapitates points. Evidently / is continuous: because 
each /(z)(z) can be computed from finitely many values of z, first setting 
n = z(0) and then using the finitely many values of tail(z) needed to compute 
/„( tail(z)). Moreover: 

y e U JnW\ ^ On £l,r £ N)[y = f„(x)\ 

(3 z G U)[y = /-(o) (tail(z))] 

taking z(0) = n , tail(z) = x 
<=i> (3z G Jf)[y = /(z)], 

so U ,0" = y [-AT] and the union of the A„’s is analytic. 

The key fact for this argument was that the mapping 

z i— > (z(0), tail(z)) 

is a surjection of A f onto N x N — actually a bijection — with continuous 
components. To prove that the intersection fj n A„ is analytic, we need a 
similar surjection 

n : Af -*► (N ^ Af) 

of Jf onto the set of infinite sequences of points. To construct such a n. fix 
some bijection />:NxNw*N and set 

PnO) = ( i ^ z(p(n, /))). (10-23) 

Each p n : Af -> Af is clearly continuous and for each infinite sequence {x„ }„ £ n 
of points we can define z such that 

z{p{n, /)) = x„(i) (n, i G N), 


so that 


PnO ) = X, 


(n G N); 
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in other words, the mapping 

n{z) = (n p n (z)) 

is a surjection. Using A Cm now, 

y e f| „ A n (V«)(3x)[v = /„(*)] 

(3{A'„}„ eN )(V«)[v = /„(*«)] 

(3ze.A0(V/i)[y = / B (/> B (z))] (10-24) 

«=>■ (3z G A0[(Vn)[/„(p„(z)) = / 0 (po(z))] 

& J = /o(po(z))]. 

For each «, the set 

Bn = {z G N I f„(p„(z)) = fo(Po(z))} 

is closed by 10 . 23 , hence the intersection 

B = Cl n Bn 

is also closed. From (10-24), however, 

C\„ An = foPo[B], 

which means that the intersection of the A n ’s is analytic. H 

10 . 25 . Definition. The family B{X) of the Borel subsets of a topological space 
X is the smallest family of subsets of X which includes the open sets and is a 
a -field, i.e., it is closed under countable unions and complementation: 

(V»)[A fl G B(X )] =► U „A n G B(X), 

A e B{X) => cA€ B{X). 

We are mostly interested in Baire space of course, 

B =df B(J\f) = the family of Borel pointsets. 

10 . 26 . Exercise. Prove that the definition makes sense , i.e., the intersection 

B(x) = ft{w \ g eg 

& (V{A„}„ C g)[\J n A n G %] 

&(VA G g)[cA G g]} 

is a a -field which contains the open sets, and hence the least such. 

10 . 27 . Exercise. The intersection {~\ n A n of every sequence of Borel sets {A„} 
is a Borel set. 

10 . 28 . Corollary. Every Borel pointset is analytic (Suslin) and hence has prop- 
erty P (Alexandra!!, Hausdorff). 

Proof. Let 

CA = {A C Jf | cA G A} (10-25) 

be the family of co-analytic pointsets, those with analytic complements. The 
family A n CA of pointsets which are both analytic and co-analytic is a 
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cr-field, since it is closed under complementation by definition, and if each 
A n e A n CA, then (J n A n and 

n n cA n 

are both analytic by the theorem. Since every open set 

G = U„{Mj \Af u CG} 

is a countable union of neighborhoods, hence analytic, and also co-analytic 
by 10.21, A n t A is a cr-field which contains all the open pomtsets and hence 
includes every Borel set. H 

In the next two theorems we clarify somewhat the relation between analytic 
and Borel pointsets. 

A pointset C C J\f separates a pointset A from another pointset B, if 
ACC, CnB = 0. 

Notice that if some C separates A from B, then A n B = 0. 

10.29. Lemma. (1) If {A,}, { B are two sequences of pointsets and for all i, j, 
Cij separates Aj from Bj, then the set C — 1J ,■ D / Cy separates |J /A/ from 

U jBj, i.e., 

C U iHjQj, (U fl A'u) = 0- (10-26) 

(2) If {Aj}, {Bj} are two sequences of pointsets and no Borel set separates 
A = \JjAj from B = [J ■ Bj , then there exist two numbers z’o and j 0 such that 
no Borel set separates A io from Bj 0 . 

Proof. (1) For any fixed i and all j, by hypothesis, A, C Cy, and so 

Aj c n jCjj\ 

so taking the unions of both sides, we get 

A = {j i A i C{j i [} j C i] , 

which is the first inclusion claimed by the Lemma. For the second, we notice 
that the hypothesis Bj n Cy = 0 means exactly that 

Bj C cCjj ( cCjj = TV \ C y ); 
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and so, fixing i and taking unions again, we get 

B = U jBj C U jQj, 
which, since i was arbitrary, yields 

b c n,u jcQj. 

Now Problem xl.3 (De Morgan’s laws) yields 

so that B n ^[J ,P] jQj'j = 0, which was what we needed to prove. 

(2) follows easily, by contradiction: because if some Borel set Cy separated 
each Aj from each B r then (J ( .f) ■ Cy separates A from B — and it is a Borel 
set. H 

10.30. The Separation Theorem (Lusin) If A. B C J\f are analytic pointsets 
and An B = 0, then there exists a Borel point set C which separates A from B. 

Proof. Suppose A = f[J\f].B = g[A/"|, where f.g are continuous func- 
tions which by Theorem 10.15 are computed by given, monotone string func- 
tions o\ r : N* — > N*, 

fix) = lim„ o(x(n)), g(y) = lim„ T(y(n)). 

For any two strings u.v, put 

A u = fWu] = {fix) | u C x}. B v = g[Jf v \ = {g{y) \ v C y}. 

and record the fact that 

A U Q K(u), B V C Af z(v) , (10-27) 

which follows directly from what it means for o to compute / and r to compute 
g. Notice also that A@ = A.B® = B, and, easily, for all u, v, 

A = U,A*<«>’ B v = [)jB vicU) . (10-28) 

We now assume towards a contradiction that no Borel set separates A from 
B and we apply repeatedly Lemma 10.29, using (10-28): since no Borel set 
separates A = A e from B = B), there exist numbers to, jo such that no Borel 
set separates from B/ J( \: and so, there exist iuji such that no Borel set 
separates from By^y, and so, etc. Formally, we define recursively two 
sequences of numbers 

x = (to, i\, ■ ■ ■ ), y = (jo, ju-..) 

such that for all n, no Borel set separates Aj( nj from B-o n) . Now this means 
that, 

for all n,N a (x(n))^K(y(n)) f 0, (10-29) 

since otherwise Jf a (j{n)) would separate A%(„) from by (10-27); and, 
finally, (10-29) says directly that 

fix ) = g(y) G D n (%(»)) Fl A/" T (y(„))), 
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which contradicts the hypothesis, that i fl B = 0. H 

10.31. Suslin’s Theorem. A pointset A C J\[ is Borel if and only if both A and 
its complement cA are analytic. 

Proof of the non-trivial direction is immediate, by applying the Separation 
Theorem to A and cA. H 

Suslin introduced the family of analytic pointsets in 1917 and proved a slew 
of theorems about it, including the Perfect Set Theorem 10.20, his famous 
characterization 10.31, and that not every analytic pointset is Borel, which we 
will not prove here. 26 The Borel sets had been introduced more than a decade 
earlier by Borel and Lebesgue and they were the key to the successful devel- 
opment of the theory of Lebesgue integration, one of the chief achievements 
of 19th century analysis. For most purposes of integration theory, including 
its later, fundamental applications to probability, every pointset of interest is 
almost equal to a Borel set, in a precise sense which basically allows us to study 
the subject as if every pointset were Borel. Because of this, the Continuum 
Problem for Borel sets was considered very important, and its simultaneous, 
independent solutions published by Alexandra!! and FLaussdorff in 1916 (just 
before Suslin established the more general Theorem 10.20) were celebrated as 
a major achievement. 

The family of analytic sets falls far short from exhausting the powerset of 
M, Problem xl0.9. Still, one might hope that the method we used to solve the 
Continuum Problem for them might be extended to prove the full Continuum 
Flypothesis, but this too is far from the mark. 

10.32. Theorem. (AC) There exists a pointset A c N which is uncountable but 
contains no non-empty perfect set. 

Proof. The key fact is that there are exactly as many non-empty, perfect 
sets as there are points in A f\ 

Lemma 1. If 2P = {PCN\Pf$,P perfect}, then \3P\ = c c. 

Proof. For each y € A f, the pointset 

A y = {x | (Vw)[j(n) < x(/t)} 

is easily perfect, equally easily y f z=>A y f A-, hence c = c |A/] < c \3P\. 
On the other hand, each perfect set P = [ T p ] is the body of a tree on N which 
determines it, so 

\&\ <C |W)| =c |P(N)| =c C. H (Lemma 1) 

Fix a set 

I= c c=c 9, (10-30) 


- ( ’The study of analytic and Borel pointsets is the core of Descriptive Set Theory , one of the 
most beautiful parts of our subject, but (unfortunately) beyond Ihe scope of these Notes. 
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for example I = c, and bijections 

a h 4 e A f, a i— > P a G 3P (a G I) 

which witness the equinumerosities (10-30). Fix also a best wellordering < of 
I . We will deline by transfinite recursion on (/, <) injections 

/ Q : seg(a) A a c Af, g a : seg(a) B a cAf (a G I), 

so that the following conditions hold. 

(1) If a < /?, then f a C f p ,g a C g^, so that A a C ^ and B a C Bg. 

(2) For each a e I , A a C\ B a = 

(3) For each a G I, B Sa n P n / 0. where .S' is the successor function in the 
well ordered set (/,<). 

Lemma 2. //' f a ,g a ( a G /) satisfy (1) - (3), then 

A = U = c c - 

but A has no non-empty, perfect subset. 

Proof. That A = c I = c c follows immediately from (1), since 

U Jo. : / A and / = c c. 

For the second claim, the key observation is that 

A n B p = 0 (/? G /); 

this is because if x Gd a nfi^, then with y = maxja, /?}, by (1), x G B y n B y . 
contradicting (2). Now if P f 0 is perfect, then P = P a for some a G I. hence 
there exists some xGf„n B Sa and then x ^ A, so P a % A. H (Lemma 2) 
The definitions of f a , g a are practically forced on us by conditions ( 1 ) - ( 3) . 
We outline the proofs of (1) - (3) by transfinite induction together with the 
clauses of the transfinite recursion definition; pedantically the proof should 
be separated out and explained after the definition is completed. 

(a) At the minimum 0 of I, set /o = go = 0. 

(b) If X is a limit point of I, set 

f X U a<).f a ' &X U a<X& a ' 

Conditions (1) and (2) follow immediately from the induction hypothesis, and 
(3) is not relevant to this case. 

(c) Suppose f = Sa is a successor point in /. By the induction hypothesis, 
A a and B a are equinumerous with seg ( a ) and seg(a) < c I, because < is a best 
wellordering. Thus, \A a \, \B a \ are both smaller than c, hence | A a U B a \ < c c, 
and we can find in the non-empty, perfect set P a = c A f distinct points 

x, y G P a \ {A a \JB a )i 

we set 

f p = fa U {(a, x)}, g p = g a U {(a, >’)}■ 
and ( 1) - (3) follow easily. 


H 
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The construction obviously proves more than what is claimed in the theo- 
rem: \A\ = c c and both A and its complement cA intersect every non-empty, 
perfect set. We leave for the problems some additional variations which make 
it even more obvious that the program of proving the Continuum Hypothesis 
by using the Cantor-Bendixson Theorem is hopeless. 

Actually, it is not only this program for settling the Continuum Problem 
which fails: every attempt to prove or disprove CH from the axioms of ZDC + 
AC is doomed, by the following two central independence results. 

10.33. Consistency of GCH, the Generalized Continuum Hypothesis (Godel, 
1939). The model L of constructible sets satisfies the Generalized Continuum 
Hypothesis GCH, (9-7), so, in particular, the Continuum Hypothesis cannot 
be refuted in ZDC+AC. 

10.34. Independence of the Continuum Hypothesis CH (Cohen, 1963). There 
is a model of ZDC+AC in which the Continuum Hypothesis fails, hence CH 
cannot be proved in ZDC+AC. Cohen’s forcing model can be modified in 
many ways to manipulate the cardinalities of pointsets and subsets of larger 
powersets. 

10.35. What does the independence of CH mean? Both the Godel and the 
Cohen methods of proof are very robust and they have been adapted to show 
that the Continuum Problem cannot be settled on the basis of many reasonable 
and plausible strengthenings of ZDC+AC by additional axioms. The same 
is true of the Axiom of Choice, of course, or the Axiom of Infinity for that 
matter, but it is clear that these propositions express new, fundamental set 
theoretic principles which are most likely true but cannot (and, in fact, cannot 
be expected to) be proved from simpler axioms by logic alone. The Continuum 
Hypothesis has the look of a technical, mathematical problem which should 
be settled definitively by a proof, but we seem to lack the insight needed to 
divine the necessary axioms. 

Much has been made of this independence of CH (and many more as- 
sertions about sets) from variants of the known axioms of set theory, and 
some have used it to argue against any objective reality behind the “formal”, 
axiomatic results of the subject. Using the method of arithmetization intro- 
duced by Godel, however, we can translate questions about the existence of 
proofs into precise, technical conjectures about integers: since there exist such 
conjectures 27 which (like CH) can be shown to be undecidable in the known, 
plausible axiomatic theories, are we then forced to deny objective reality to 


- 7 The type of statements we have in mind here are of the form “if ZDC+AC is consistent, then 
so is T”, where T is some strong extension of ZDC+AC which, in fact, implies the consistency 
of ZDC+AC. Godel’s Second Incompleteness Theorem implies that statements of this type are 
independent of ZDC+AC (unless ZDC+AC is inconsistent), and there are many of them about 
whose truth there is genuine controversy. 
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the natural numbers? In fact, it is not possible to discuss such problems intel- 
ligently without reference to notions and results of Mathematical Logic which 
are beyond the scope of these Notes, and we will resist the temptation. 

Incidentally, there are scores of interesting propositions about sets which 
cannot be settled on the basis of ZDC or ZDC+AC: CH is only the most 
interesting of them. We mention here just three more independence results of 
this type, because they are relevant to the Perfect Set Theorem 10.20. 

10.36. (Godel, 1939) In the model L of constructible sets, there exists an 
uncountable, co-analytic set which has no proper perfect subset. This means 
that we cannot improve the Perfect Set Theorem 10.20 in ZDC+AC to show 
that every co-analytic pointset has property P. 

10.37. (Solovay, 1970) There is a model of ZDC+AC in which every “defin- 
able” pointset has property P. We will not attempt to define “definable”, but 
analytic complements are definable. 

10.38. (Solovay, 1970) There is a model of ZDC in which every pointset has 
property P. 

Solovay’s models are constructed by Cohen’s forcing method, but like 
Godel’s L, they have many more canonical properties which yield numer- 
ous unprovability results. The first Solovay model witnesses (with 10.36) 
that the property P for analytic complements cannot be proved or refuted 
in ZDC+AC. The second Solovay model shows that ZDC cannot prove the 
existence of an uncountable pointset with no non-empty, perfect subset; DC 
is not a sufficiently strong choice principle to effect the construction. 


Problems for Chapter 10 

xlO.l. Prove that if F C Af is closed, then there is a unique tree T onNwithout 
terminal nodes such that F = [T], namely the tree T F defined in (10-8). 

xl0.2. Prove that the decomposition (10-12) of a closed pointset F into a 
perfect set P and a countable set S determines uniquely P and S. 

xl0.3. Give an example of a closed pointset F C Af and a continuous / ; 
Af — > Af, such that the image f[F] is not closed. 

xl0.4. Prove that every open pointset is an T a and every closed pointset is a 
Gs- The definitions are reviewed in Footnote 25. 

^ x 1 0.5. Prove that the inverse image g _1 [+] of an analytic pointset A by a 
continuous function g ; Af — > Af is analytic. Hint. Aim for an equivalence of 
the form 

yeg~ l [A] +=+ {3x)[y = = g(p 2 (x))] 

where / is continuous and p„ are defined by (10-23), and then use 10.23. 
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*xl0.6. Prove that 

•W = u ieN^/ ==*■ =C A/ - ]. 

i.e., A f is not the union of a countable sequence of pointsets of smaller cardi- 
nality. Hint. This follows easily from Konig’s Theorem 9.21, but the relevant 
special case does not need the full Axiom of Choice. 

xl0.7. (AC) Prove that for every k < c c. there exists a pointset A with | A \ = c k 
which contains no non-empty, perfect subset. 

xl0.8. (AC) Prove that there exists an uncountable pointset A such that nei- 
ther A nor its complement contain an uncountable Borel set. 

xl0.9. Prove that there are c-many analytic and Borel pointsets, 

1-41 =c \B\ = c c. 

10.39. Definition. A function f : X — > Y from one topological space into 
another is Borel measurable if the inverse image / -1 [G] of every open subset 
of Y is a Borel subset of X. 

xlO.10. The composition gf : X — > Z of two Borel measurable functions 
/ : X — » Y and g : Y —> Z is Borel measurable. 

xlO.ll. The inverse image f~ x [A] of a Borel set A C Y by a Borel measurable 
function / : X — > Y is a Borel subset of X. 

10.40. Definition. Two topological spaces X, Y are Borel isomorphic if there 
exists a Borel measurable bijection / : X >— » Y, whose inverse : Y >— » X 
is also Borel measurable. Borel isomorphic spaces have the same measure- 
theoretic structure and for all practical purposes can be “identified” in measure 
theory. 

*xl0.12. Suppose / : X >— > Y andg : Y >— ► X are Borel measurable injections 
between topological spaces, with the following additional property: 28 there 
exists Borel measurable functions f\ : Y — » X and g\ : X — > Y which are 
inverses of / and g in the sense that 

fif{x) = x (xeX), 

gig(y) = y (y e t). 

Prove that X and Y are Borel isomorphic. Hint. Use the proof of the 
Schroder- Bernstein Theorem 2.26. 

xl0.13. Consider the Cantor set C as a topological subspace of J\f in the 
obvious way, the open sets being unions of neighborhoods of the form 

Mu = {x & C\ uQ x} (■ u e {0, 1}*). 

Prove that C and M are Borel isomorphic. 


28 In fact, every Borel injection has this property, but the proof of this requires some work. 
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In the remaining problems we explore the connection of the specific, com- 
binatorial notions we studied in Baire space with their general, topological 
versions. 

10.41. Definition. A point x is a limit point of a set A in a topological space 
X if every open set G which contains x contains also some point of A other 
than x, 

(VG)[G open and x £ G => (3 y e A n G)[x f y]]. 

A limit point of A may or may not be a member of A. A point of A which is 
not a limit point of A is isolated in A. 

xl0.14. Determine the limit points and the isolated points of the pointset 

B = {x £ Af | x(0) = 1 V (V«)[x(n) = 2] V (3«)[x(«) = 3]}. 

xl0.15. Prove that x is a limit point of A if and only if every open set containing 
x contains infinitely many points of A. 

xl0.16. Prove that a set is closed in a topological space X if and only if it 
contains all its limit points. 

xl0.17. Prove that a pointset P is perfect if and only if it is closed and has 
no isolated points, i.e., every point of P is a limit point of P. This equiva- 
lence identifies the specific definition of perfect pointsets we adopted with the 
classical, topological definition. 

10.42. Definition. A sequence (n i— > x„) of points in a topological space X 
converges to a point x or has x as its limit if every open set containing x 
contains all but finitely many of the terms of the sequence, 

lim„ x„ = x ^=4>df (VG open, x £ G)(3« £ N)(Vz > «)[x ( - £ G]. 

xl0.18. Prove that a point x is a limit point of a pointset A if and only if 
x = lim„ x„ is the limit of some sequence (« i— > x„ £ A) of points in A. Which 
choice principle did you use, if any? 

xl0.19. Prove that a function / : Af — > Af is continuous if and only if 

/ (lim„ x„) = lim„ / (x„), 

whenever lim„ x„ exists. Which choice principle did you use, if any? 

xl0.20. A topological space X is Hausdorff if for any two points x / y, there 
exist disjoint open sets G n H = 0 such that x e G and y € PI. Prove that 
if /. g : X — > Y are continuous functions and Y is Hausdorff, then the set 
{x e X | / (x) = g(x)} is closed in X. 

10.43. Definition. An open covering of a set K in a topological space X is any 
family & of open sets whose union includes K. K C |J 'S . A set K is compact 
in X if every open covering of K includes a finite subcovering, i.e., for every 
family S' of open sets, 

KC i)S=> (3G 0 , . . . , G n e S)[K C U G,]. 
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*xl0.21. Prove that a pointset is compact by Definition 10.18 if and only if it 
is compact by Definition 10.43. Hint. You will need Konig’s Lemma 9.7. 

xl0.22. Prove that for any two topological spaces X, Y, any continuous 
function / : X — > Y and any compact set K C X, the image f[K] is compact 
in Y. 



CHAPTER 11 


REPLACEMENT AND OTHER AXIOMS 


We have just about reached one of the goals we set in Chapter 4 , which was 
to prove all the “naive” results of Chapter 2 from the axioms of Zermelo. 
Only a couple of minor points remain, but they are significant: they will 
reveal that Zermelo’s axioms are not sufficient and must be supplemented 
by stronger principles of set construction. Here we will formulate and add 
to the axiomatic theory ZDC the Axiom of Replacement discovered in the 
early 1920’s, a principle of set construction no less plausible than any of the 
constructive axioms (I) - (VI) but powerful in its consequences. We will also 
introduce and discuss some additional principles which are often included in 
axiomatizations of set theory. Using only a weak consequence of Replace- 
ment, we will construct the least Zermelo universe Z, a remarkably simple 
set which contains the natural numbers, Baire space, the real numbers and all 
the significant objects of study of classical mathematics. Everything we have 
proved so far can be interpreted as if Z comprised the entire universe of math- 
ematical objects, yet Z is just a set — and a fairly small, easy to comprehend 
set, at that! Our main purpose in this chapter is to understand the Axiom of 
Replacement by investigating its simplest and most direct consequences. The 
real power of this remarkable proposition will become apparent in the next 
chapter. 

According to (2) of 2 . 16 , if A is a countable set and for each n > 2, 


A" = Ax ■■■ x A, 

s. v ✓ 

n times 


then the union (J ^L 2 A n ' s also countable. The obvious way to prove this from 
the axioms is to define first the sets A" by the recursion 


/( 0) = A x A. 
f(n + 1) = f(n) x A, 

so that f(n) = A n+ 2 and 

U„°V" +2 = U/[N]. 


( 11 - 1 ) 


( 11 - 2 ) 


Cantor’s basic 2.10 implies first (by induction) that each /(«) = A n+1 is 
countable, and then that their union (J “ i 0 A " + 2 must also be countable. Is 
there an error? Certainly not in the proof by induction, which is no different 
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than many others like it. There is a problem, however, with the recursive 
definition (11-1) which cannot be justified by the Recursion Theorem 5.6 as 
it stands. To apply 5.6 we need a set E, a function h : E —* E on E and some 
a € E, which then determine a unique / : N — > E satisfying 


/( 0 ) = a, 

fin + 1) = h{ f{n)). 


(11-3) 


In the case at hand there is no obvious E which contains A and all its products 
A", and instead of a function h. we have the operation 

hiX) = df X x A, (11-4) 


which associates with each set X its product X x A with the given set A. To 
justify definition ( 1 1 - 1 ) , we need a recursion theorem which validates recursive 
definitions of the form (11-3). for every object a and (unary) definite operation 
h. It looks quite innocuous, only a mild generalization of the Recursion 
Theorem — and it is just that — but in fact such a result cannot be established 
rigorously on the basis of the Zermelo axioms. 

11.1. (VIII) Replacement Axiom. For each set A and each unary definite oper- 
ation H . the image 

H[A\ = df {Hix) | x £ A} 

of A by H is a set. 

As a construction principle for sets, the Replacement Axiom is almost 
obvious, as plausible on intuitive grounds as the Separation Axiom. If we 
already understand A as a completed totality and El associates in a definite 
and unambiguous manner an object with each x £ A, then we can “construct” 
the image H[A ] by “replacing” each x € A by the corresponding Hix). 


11.2. Axiomatics. The axiomatic system ZFDC of Zermelo-Fraenkel set the- 
ory with Dependent Choices comprises the axioms of ZDC and the Replace- 
ment Axiom (VIII), symbolically 

ZFDC = ZDC + Replacement = (I) - (VIII). 

From now on we will use all the axioms of ZFDC without explicit mention 
and we will continue to annotate by the mark (AC) the results whose proof 
requires the full Axiom of Choice. 29 


Mostly we have used simple definite operations up until now, those directly 
supplied by the axioms like V( A) and (J e? and explicit combinations of them, 
e.g., the Kuratowski pair (x, y) =df {{-v}, {x, y}}. Once we assume the 
Axiom of Replacement, however, definite operations come into center stage 
and we will need to deal with some which are not so simply defined. We 
describe in the next, trivial Proposition the basic method of definition we will 
use, primarily to point attention to it. 


29 The Axiom of Replacement was introduced independently by Thoralf Skolem and Abraham 
Fraenkel in the early 1920s. 
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11.3. Proposition. Suppose C and P are definite conditions of n and n + 1 
arguments, respectively, assume that 

(Vx)[C(x) => (3 \w)P(x, «;)], (11-5) 

and let 

, f the unique w such that P(x, w), if C (x), 

F[x) =df \ 0, otherwise 

The n-ary operation F is definite. 

In practice, we will appeal to this observation by setting 

F(x) =df the unique w such that P(x, w) (C(x)), 

after we verify (11-5), without specifying the irrelevant value of F outside 
the domain we care about. The Axiom of Replacement often comes into the 
proof of (11-5). 

11.4. Exercise. For each unary definite operation F . the operation 

G(X) = df F[X] = {F(x) \ x £ X} (Set(X)) 

is also definite. 

The next fundamental consequence of the Replacement Axiom generalizes 
the Transfinite Recursion Theorem in two ways: by allowing a definite oper- 
ation instead of just a function in the statement, and by replacing the given 
well ordered set by an arbitrary grounded graph. The second generalization 
does not require the Replacement Axiom, Problem x8.11. 

11.5. Grounded Recursion Theorem. For each grounded graph G with edge 
relation — > and each binary definite operation FI , there exists exactly one function 
f : G — > f[G] which satisfies the identity 

fix) = H(f\{y £ G | x -> y},x). 

Proof. As in the proof of 7.24, we first show a lemma which gives us a set of 
approximations of the required function. Instead of the initial segments of G 
(which do not make much sense for an arbitrary graph), these approximations 
are defined here on downward closed subsets of G. Recall the definition of the 
transitive closure of a graph => G given in 6.34; we will skip all the subscripts 
in what follows, since only the single graph G is involved in the argument, and 
we will also use the inverse arrows, 

u < — t t — > u <==3- u is immediately below t, (11-8) 

x <= t 4=>df t => x x is (on some path) below t. (11-9) 

Lemma. For each node t £ G, there exists exactly one function a with domain 
the set {x £ G \ x <= t} which satisfies the identity 

<j(x) = H(a \{y £ G \ y <— x},x) (x 4= t). 


( 11 - 6 ) 


(11-7) 


( 11 - 10 ) 
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Proof. Suppose, towards a contradiction, that / is a minimal node of G 
where the Lemma fails. Thus, for each u <— t, there is exactly one function a u 
such that 


a u {x) = H(a u \{y € G \ y 

<— x},x) (x<=u). 

(11-11) 

First we notice that 



[x <= u <— t & X 4= v <- 

- ?]=><7„(x) = cr„(x); 

(11-12) 

because if x were minimal in G where (11-12) failed, then 


cr„(x) = H(o u r{y € G I y <- 
= H{a v \{y e G \ y <- 
= cr v (x) 

x},x) by (11-11), 

x } , x) by the choice of x, 

by (11-11) for er„. 



The operation u i— > a u which assigns this a u to each u <— t is definite, so by 
the Axiom of Replacement its image is a set and we can set 

o\ =df U Wn I u <- 

this o i is a function by (11-12), and by the definition, 
o\{x) i <=>■ (3 u)[x -4= m < — t]. 

By another application of the Replacement Axiom, 

°2 =df {(v, H{a v T {x | x <— u}), x) | v <— / & -i(3m)[u 4= u <— ?]} 

is also a set, and by its definition it is a function with domain disjoint from 
that of o i. Thus 

a =df o\ U <72 


is a function, and 


cr(x ) i <=>■ (3 u)[t — > u & u =» x] V t — > u 
■<=>■ t => x (by 6.35). 

Moreover, a satisfies (11-10) because rr\ and 02 do. Finally, the same argument 
by which we proved (11-12) shows that no more than one a with domain 
{x e G | x <= /} can satisfy (11-10), and that completes the proof of the 
Lemma. 3 (Lemma) 

To prove the Theorem, we apply the Lemma as in 7.24 to “the successor 
graph” 

Succ(G) = df GU{/*}, 

X -^Succ(G) y ^=^df X -*J'V [x = t* &y G G], 
which has just one more node than G. at the top. 3 

11.6. Corollary. (1) For each well ordered set U and each binary definite oper- 
ation FI . there exists exactly one function f : U — > f{U\ which satisfies the 
identity 


f{x) = H{f\ seg(x),x) (x e U). 


(11-13) 
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(2) For each object a and each definite unary operation F, there exists a 
unique sequence (n i-» a„) which satisfies the identities 

ao = a, a n+ i=F(a n ) (« £ ff). (11-14) 

We call (n i-> a n ) the orbit of a under F. 

Proof. For (1) we apply 11.5 to the graph (Field) U), >u), and for (2) to 
the graph (N, — >), where 

n — > m ^=^df n = m + 1 . H 

11.7. Exercise. Which definite operation FI do we use to prove (2) of the Corol- 
lary 1 

The orbit of a set A by the unionset operation reveals the hidden structure of 
A under the membership relation by exposing the members of A. the members 
of the members of A, the members of those, etc. ad infinitum. 

11.8. Definition. A class or set M is transitive if (J M C M , equivalently 

(Vx G M)(\/t G x)[t G M], 
or just x G M =$■ x C M. 

11.9. Exercise. The sets®, {0, {0}}, {0. {0}, {0, {0}}}. the set 

N o = {0, {0}, {{0}}- (11-15) 

postulated by the Axiom of Infinity and every class whose members are all atoms 
are transitive. 

11.10. Transitive Closure Theorem. Every set A is a member of some transitive 
set M . in fact , there is a least ( under C) transitive set M = TC(^l) such that 
A G TC(A). We call TCM ) the transitive closure of A. 

Proof. By (2) of 11.6, there is a unique sequence n i— > TC „(A) which 
satisfies the identities 


TC„U) = {A}. 

(11-16) 

TC„+i (A) = UTC n (A), 

TC (A) = d fU„TC n (A). 

(11-17) 


Clearly A G TC (A) and TC(A ) is transitive, because 

u G TC n (A)=*u C |JTC„U) = TC„+iU). 

If M is transitive and A G M. then TCo(^4) = {^4} C M, and by induction 
TC „(A) C M=>TC n+I (A) = 1JTC n (A) C \JM C M, 
so that in the end TC(^) = 1J n TC„(^) CM. H 

11.11. Exercise. If A is transitive, then TC(^4) = dU {A}. 

To understand better the remark about “revealing the hidden G-structure” 
of A, consider the following natural concepts. 
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11.12. Definition. A set A is hereditarily free of atoms or pure if it belongs to 
some transitive set which contains no atoms; equivalently, if TC(^) contains 
no atoms. A set A is hereditarily finite if it belongs to some transitive, finite set; 
equivalently, if TC(^4) is finite. A set A is hereditarily countable if it belongs 
to some transitive, countable set; equivalently, if TC(^4) is countable. 

The point of the definitions is that {{«}} is a set but not a pure set if a is 
an atom, because we need a to construct it; { N } is finite but not hereditarily 
finite because we need all the natural numbers to construct it; {A/ - } is countable 
but not hereditarily countable because we need to “collect into a whole” an 
uncountable collection of objects in J\T before we can construct the singleton 
{A/ - } by one final, trivial act of collection. Put another way. {A/ - } is not 
hereditarily countable because “its concept involves” an uncountable infinity 
of objects, the members of its sole member A f. 

11.13. Exercise. The Principle of Purity 3.25 is equivalent to the assertion that 
every set is pure. 

11.14. Exercise. A transitive set is hereditarily finite if it is finite , and heredi- 
tarily countable if it is countable. 

Next we consider the closure of a set under both the unionset and powerset 
operations. 

11.15. Theorem (Basic Closure Lemma). For each set I and each natural num- 
ber n, let M n = M„(I) be the set defined by the recursion 

Mq = I, M n+ \ = M„ U 1J M n U V{M n ). (11-18) 

The basic closure of I is the union 

M = M(/)= df U“o M »(/), (H-19) 

and it has the following properties. 

(1) M is a transitive set which contains 0 and I . it is closed under the pairing 
{x, y}, unionset (J if and powerset V(A) operations and it contains every subset 
of each of its elements. 

(2) M is the least ( under C) transitive set which contains I and is closed under 
{x, y}, (Jif andV(X). 

(3) If I is pure and transitive, then each M n is a pure, transitive set and satisfies 

M n+l =V(M„). (11-20) 

As a consequence, M is a pure, transitive set. 

Proof. (1) By the definition, 0,7 e M\ C M. If x, y e M, then from 
the obvious M n C M n+ 1 , there exists some m such that {x, y} C M m , so 
{x, y } G M m+ 1 . The key inclusion for the remaining claims is 

x € M n => x C |J M n C M n+ 1 C M, 

which implies immediately that M is transitive. It also implies 

x € M n ■ > (J x C (J M n +\ C A7 n + 2 > 
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so x G M„ => IJ .v G M n+ 3 C M and M is closed under IJ v. The same 
argument shows that M is closed under 'P(X) and the last assertion follows 
by this closure and transitivity. 

(2) If M' is closed under {x, y} and \JW, then it is also closed under 
AU B = IJ {A,B}: and if M' is also closed under V{X), then a simple 
induction shows that M„ G M' for each n. and so M C M' by the transitivity 
of M' . 

(3) If / is transitive with no atoms, then every M„ is transitive and has 

no atoms by a trivial induction on n. This implies that M is a transitive set 
with no atoms and hence pure, but also that M n U J M n C V{M n ), so that 
M n+ \ = V(M n ). H 

11.16. Exercise. True or false: for every transitive set X , X C V{X). 

11.17. Exercise. If I C /, then for each n, M n (I) C M n (J), and hence 

I c J=^M{I) C M(J). 

11.18. The grounded, pure, hereditarily finite sets. The least basic closure is 
that of the empty set. M(0) C M(7). for every I. In the classical notation 
(which we will explain in the next chapter), M„(0) = V„, so that each V„ and 
their union are determined by the identities 

Vo = 0, V n+ i = V(V n ), J= d fU “ 0 V„ = M(0). (11-21) 

Forexample,0 G Fi.{0} G V 2 and {{0}. {{0}}} G V 4 ! These sets are pictured 
on the left in Figure 11.1. The set V co is grounded, pure and transitive, each 
V n is finite by an easy induction, so every set in V w is grounded, pure and 
hereditarily finite, and V m itself is countable. These are the sets which can 


30 Picturing universes of sets by cones like this is traditional but misleading; the successive 
powersets grow hyperexponentially in size, so it would be more accurate to draw a cone with 
curved, hyperexponential sides. 
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be constructed “from nothing” (literally, the empty set) by iterating any finite 
number of times the operation of collecting into a whole (putting between 
braces) some of the objects already constructed. 

The closure properties of M(I) itemized in (1) of 11.15 are precisely those 
required of the universe W by axioms (II) - (V), as we discussed them in 3.26, 
and if the set No of (11-15) demanded by the Axiom of Infinity (VI) is a subset 
of I, we also have N 0 € M(I). Notice also that since M(I) is transitive, for 
A,B <E M(J), 

A B => {3t € M(l))[t €(A\B)U(B\ A)], (11-22) 

which says of M(I) what the Axiom of Extensionality demands of VV by 
(3-12). This suggests that if we take “object” to mean “member of Af (/)”, for 
any / D No, then we can reinterpret every proof from the axioms (I) - (VI) 
as an argument about the members of M{I) instead of all objects, in the end 
proving a theorem about M(I) instead of W. It is an important idea, worth 
abstraction and a name. 

11.19. Definition. A transitive class M is a Zermelo universe if it is closed under 
the pairing {x, y}, unionset (J % and powerset V{X) operations, and contains 
the set No defined by (11-15). The least Zermelo universe is Z = M{ No), 
determined by the identities 

Z 0 = No, Z n+l =V(Z n ), Z = {JZ 0 Z„. (11-23) 

11.20. Exercise. The class W of all objects is a Zermelo universe. Every Zer- 
melo universe contains the empty set as well as every subset of each of its 
members. 

A Zermelo universe M is a model of the axioms (I) - (VI) , and a very special 
model at that, since it interprets standardly the basic relations of membership 
and sethood — it only restricts the domain of objects in which we interpret 
propositions. The claim that logical consequences of (I) - (VI) are true in 
every Zermelo universe is called a metatheorem, a theorem about theorems. To 
make general results of this type completely precise and prove them rigorously 
requires concepts from Mathematical Logic. In specific instances, however, 
lemma by lemma and proposition by proposition, it is quite simple to see 
what the specific consequence of the axioms means for an arbitrary domain of 
objects and to verify it in every Zermelo universe: this is because, in fact, we 
have been using the axioms as closure properties of the universe, about which 
we have assumed nothing more but that it satisfies them. 

11.21. Proposition. Every Zermelo universe M is closed under the Kuratowski 
pair operation (x, y) defined in (4-1), as well as the Cartesian product A x B. 
function space (A — > B), and partial function space (A — >• B) operations, 
provided these are defined using the Kuratowski pear. In addition, if A £ M and 
~ is an equivalence relation on A, then ~ and the quotient [[A/~] defined in 
4.12 are also in M . 
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Proof. The Kuratowski pair (x, y) = {{x}, {x, y}} of any two members 
of M is obtained by taking unordered pairs twice, so it is certainly in M. If 
A. B £ M, then A U B = 1J {A. B}, and by the proof of 4.2, 

Ax B C V{V{AUB)) £ M, 

so A x B £ M . The remaining claims are proved similarly. H 

11.22. Exercise. Every Zermelo universe M contains a Peano system as defined 

in 5.1. 

11.23. Proposition. (1) The Axiom of Dependent Choices is true in every Zer- 
melo universe M , in the following sense : if a £ A £ M, P C A x A, P £ M 
and N £ M is a system of natural numbers in M , then 

a £ A&(\/x £ A)(3y £ A)P(x,y) 

=> (3/ : N A)[f £ M&/( 0) = a & (Vn £ N )P(f(n),f(n + 1))]. 

(2) (AC) The Axiom of Choice is true in every Zermelo universe M , in 
the following sense, following 8.4: for every family £ M of non-empty and 
pairwise disjoint sets, there exists some set S £ M which is a choice set for "S , 
i.e., 

sc \jg, (vxer)(3 u)[snx = {u}]. 

Proof. (1) The hypothesis of the implication to be proved implies by DC 
that there exists some function / : N — ■> A such that / (0) = a and for every 
n £ N, P(f(n),f(n + 1)). Since (N — > A) £ M, we also have f £ M by 
transitivity. 

Part (2) is proved similarly. H 

Although we chose specific versions of the choice principles to simplify 
these arguments, their numerous equivalents are also true in every Zermelo 
universe M. This can be verified directly, or by observing that the equivalence 
proofs we have given can be “carried out within M”. 

11.24. Exercise. (AC) If M is any Zermelo universe, A. B £ M, and P is any 
binary definite condition, then 

(fJx £ A)(3y £ B)P(x,y) 

=►(3/ £ M)[f : A —> B & (Mx £ A)P(x,f(x))]. (11-24) 

11.25. The least Zermelo universe Z. Let us concentrate on the least Zermelo 
universe Z, to focus the argument. It is a pure, transitive set, constructed by 
starting with the simple set No and iterating the powerset operation infinitely 
many times, much as we construct the natural numbers starting with 0 and 
iterating infinitely many times the successor operation. We can think of the 
sets in Z as precisely those objects whose existence is guaranteed by the axioms 
(I) - (VI) . The natural interpretations of DC and AC are also true in Z, the 
latter under the assumption that AC is true in W. Using the closure properties 
of Zermelo universes already established and looking back at Chapters 5, 6 
and 10 and ahead at Appendix A, we can verify that Z contains not only the 
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specific system of natural numbers N we constructed in Chapter 5, but also 
the Baire space A f defined from this N and the specihc systems of rational and 
real numbers constructed in Appendix A. By the uniqueness results, any one 
of these systems is as good as any other, so we can say that Z contains the 
integers, Baire space, the rationals and the reals. 

Combining these remarks with some knowledge of classical mathematics, 
it is not hard to give a convincing argument that all the objects studied in 
classical algebra, analysis, functional analysis, topology, probability, differential 
equations, etc. can be found (to within isomorphism) in Z. Many fundamental 
objects of abstract set theory are also in Z, all we have constructed before this 
chapter in developing the theory of inductive posets, well ordered sets, etc. In 
slogan form: we can develop classical mathematics and all the set theory needed 
for it as if all mathematical objects were members of Z. 

The same can be said of every Zermelo universe, of course, but the concrete, 
simple definition of Z makes it possible to analyze its structure and investigate 
the special properties of its members. For example, no set which is a member of 
itself belongs to Z\ because no X £ N 0 satisfies X £ X (easily), and if n were 
least such that some X £ X £ Z n+ \ , then X £ X C Z„ by the definition, so 
X £ Z n , contradicting the choice of n. This looks good, we had some trouble 
with sets which belong to themselves. Actually, the iterative construction of 
Z ensures a much stronger regularity property for its members, discovered by 
von Neumann. 

11.26. Definition. An object x is ill founded if it is the beginning of a descend- 
ing e-chain, i.e., if there exists a function / : N — > E such that 

X = /(0)9/(l)9/(2)9 ••• . 

Objects which are not ill founded are well founded or grounded. If A £ X , 
then X 9 X 9 X 9 • • • , so X is ill founded. Problem xll.14 gives a simple 
characterization of ill founded sets directly in terms of the £ relation, which 
suggests that ill foundedness is a generalization of self-membership. 

11.27. Exercise. Atoms are grounded, as is 0 and No- A set is grounded if and 
only if cdl its members are grounded, if and only if its powerset is grounded. The 
class of all grounded sets is transitive. 

11.28. Proposition. If I is grounded, then so is its basic closure M(I). In 
particular, the least Zermelo universe Z and all its members are grounded. 

Proof. Assume that / is grounded, let (towards a contradiction) n be least 
such that M„ is ill founded and suppose that M„ 9 xi 9 • • • is a descending £- 
chain. By hypothesis n > 0. Since xi £ M n _\ and x\ £ y £ M n _ \ contradict 
the choice of n, we must have x\ C M„_i, so xi £ M„_i and the descending 
G-chain M„_i 9 xi 9 • • • contradicts again the choice of n. It follows that 
M is also grounded, since any descending chain M 9 x\ 9 • • • would also 
witness that M n is ill founded for whatever M n contains x\. The consequence 
about Z follows because N 0 is grounded. H 
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Figure 11.2. {0. {0}} as a disappointing gift. 

In the old gag. the excited birthday boy opens up the huge box with his 
present, only to find inside it another box, and inside that another, and so on, 
until the last, tiny box is empty: his present is just the boxes. We can think of 
a pure, grounded set as a disappointing gift of this sort, except that each box 
may contain several boxes, not just one; no matter which one the birthday 
boy chooses to open up each time, eventually he finds the empty box, 0. Most 
axiomatizations of set theory ban ill founded sets from the start by adopting 
the following principle proposed by von Neumann. 

11.29. Principle of Foundation. Every set is grounded. This is also called the 
principle (or axiom) of Regularity in the literature. 

It is worth putting down here an equivalent version of this Principle, which 
is somewhat opaque but useful. 

11.30. Proposition. The Principle of Foundation is true if and only if for every 
non-empty set X , there is some m £ X such that 

m nl = 0. (11-25) 

Proof. Assume first the Principle of Foundation and suppose, towards a 
contradiction, that X f 0 but no m £ X satisfies (11-25). This means that 
for some a , 

a £ X & (Vm £ X){3t £ X)[t £ m], 

and then DC gives us an infinite descending G -chain beginning with X 9 a 
which contradicts the hypothesis. Conversely, if the Principle of Foundation 
fails and some infinite descending e -chain starts with some set 

X = f(0) 3 f(l) 3 f(2) 3 

then the set /[N] = {/ (0), /(l), . . . } is not empty and intersects each of its 
members, so none of them satisfies (11-25). H 

11.31. Zermelo-Fraenkel set theory (with choice), ZFC. By far the most widely 
used — the “official” — system of axioms for sets is the Zermelo-Fraenkel The- 
ory (with choice), which accepts the Principles of Purity 3.25 and Foundation 
in addition to those of ZFDC and the Axiom of Choice, symbolically, 

ZFC = ZFDC + AC + Purity + Foundation. 
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Figure 11.3. Cl as the ultimately frustrating gift. 

There are many and convincing arguments in favor of this industry standard, 
some of which we discuss immediately below. We will come back to the 
question in Chapter 12 and Appendix B, where we will also explain a few 
good foundational reasons for sticking with the weaker ZFDC in these Notes. 
As a practical matter, the principles of Purity and Foundation do not come 
up in the part of the subject we are covering, and the full AC is only needed 
rarely, so we can easily keep track of it. 

11.32. Are all sets grounded? The most blatant exception to the Principle of 
Foundation would be a set which is its own singleton, 

0 = {0}. (11-26) 

We can think of Cl as the ultimately frustrating gift: each box has exactly 
one box inside it, identical with the one you just opened, and you can keep 
opening them forever without ever finding anything. How about sets O 1 and 
Cl 2 such that 

Q'={0,O 2 }, £2 2 = {fl 1 }? (11-27) 

These equations look unlikely, even bizarre, but it is not clear that our axioms 
rule them out. As a matter of fact they do not: we will construct in Appendix 
B some quite reasonable models of ZFDC+AC which contain some Cl = {£2} 
and many other sets with similar properties. 

Recall the discussion about the large and the small heuristic views of the 
universe of objects W in 3.26. The large view conceives VV as the largest 
possible collection of objects which satisfies the axioms, while the small view 
takes it to comprise just the objects guaranteed by them. 

On the large view, we have no more evidence in favor or against the Principle 
of Foundation now than we did back in Chapter 3, except that we have proved 
all these things about sets without ever using it. But then again, we never saw 
a need for ill founded sets either. 

On the small view, we have amassed some considerable evidence, at least 
for ZDC+AC, and it is all in favor of the Principle of Foundation: we now 
have a precise idea of what sets are “guaranteed” by the axioms of ZDC+AC, 
they are the members of Z and they are all pure and grounded. It may 
be argued that we did not build Z out of whole cloth, we worked within a 
“given” universe W of objects: in fact, we needed to assume that VV satisfies 
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the Axiom of Replacement in addition to the axioms of ZDC+AC. This is 
certainly true, but so is the obvious response to it: aside from any rigorous 
axiomatization, the definition of Z and the proofs of its basic properties 
can be understood intuitively, naively, and they carry considerable force of 
persuasion. An informal description of Z would have made perfect sense in 
Chapter 3, as an intuitive conception of “restricted set” which justifies the 
axioms of ZDC+AC and the principles of Purity and Foundation. We have 
not been able to produce any such plausible, intuitive model of ZDC which 
contains ill founded sets from any hypotheses which do not beg the question. 31 

Could we construct simple models like Z for the theories ZFDC and ZFC? 
Let us first give them a name. 

11.33. Definition. A ZFDC-universe is any Zermelo universe M which further 
satisfies the Axiom of Replacement in the following sense: for each A G M 
and each unary definite operation H, 

(Vx G M)[H(x) G M] => H[A\ = { H(x ) \ x G A} G M. 

A ZFC-universe is any ZFDC-universe M which contains no atoms and 
such that every set A £ M admits a well ordering in M. 

The axioms of ZFDC assert precisely that the class W of all objects is a 
ZFDC -universe, and, correspondingly, the axioms of ZFC claim that W is a 
ZFC -universe. 

11.34. Theorem. The von Neumann class 

V — df {A | A is a pure, grounded set} (11-28) 

is a ZFDC-universe ; and if AC holds , then V is a ZFC-universe. 

Proof. The fact that V is a Zermelo universe is quite trivial, most of it 
following from Exercise 11.27. To verify that V also satisfies the Axiom of 
Replacement, notice that (whether A G V or not), if H is unary, definite and 
such that for every x G A. the value // (x) is a pure, grounded set, then the 
image H[A ] has only pure and grounded members, so it is (easily) pure and 
grounded. 

The second claim follows immediately, because if AC holds, then every 
+ G V admits a well ordering <^G V{A x A) and <^G V. H 

There is another, elegant and useful characterization of the pure grounded 
sets which follows easily from the Grounded Recursion Theorem 11.5. 

11.35. Definition. A Mostowski surjection or decoration of a graph G with 
edge relation — > is a surjection d : G — » d[G] which assigns a set to each node 
of G such that 

d(x) = {d{y) | y <— x} (x G G ). (11-29) 


31 Models like M ( No U £2) beg the question, because they need some Q with the requisite 
self-membership property to get started. 
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Figure 11.4. Mostowski collapsing; d G [G] = { 0 , { 0 }. {{ 0 }}. {{ 0 }, {{ 0 }}}}- 

11.36. Theorem (Mostowski Collapsing Lemma). (1) Every grounded graph G 
admits a unique decoration do, and its image d G [G] is a transitive, pure, grounded 
set. 

(2) A set A is pure and grounded if and only if there exists a grounded graph 
G and a node x e G, such that A = d G (x) for the unique decoration da of G. 

Proof. (1) The existence of a unique decoration of a grounded G follows 
immediately from the Grounded Recursion Theorem 11.5 applied to G, with 
the definite operation 

H(f) = Image)/) = {/(x) | /(x) }}. 

The image d G [G ] is transitive, since if s £ t £ d G [G], then s e t = d G (y) for 
some y € G, and then s = d G (x) for some x <— y, so s e d G [G ]. Since each 
d G (x) is a set. by (11-29), d G [G ] is a transitive set with no atoms and hence 
pure. Finally, d G [G] is grounded, because if x 0 9 xj 9 • • • were an infinite, 
descending e-chain in it and so, si, ... were chosen so that d G (sj) = x,-, then 
so — > Ji — ► • • • would be an infinite descending chain in the grounded graph 
G. 

(2) If A = d G (x) with x a node in some grounded graph, then A is a member 
of a transitive, pure, grounded set by (1) and hence pure and grounded. For 
the converse, let G = TC(^) be the transitive closure of A and define on it 

x — ► y 4=>df y G x. 

The graph G is grounded, because G = TC(A) is a grounded set, Problem 
xll.16. In addition, 

d G (x) = x (x € G); (11-30) 

because if x were a G-minimal counterexample to (11-30), then 
d G (x) = {d G (y) | y G < x}. 

= {y | y <— x} by the choice of x, 

= {y | y ex} by the def. of — >, 

= x because x is a set. 

In particular, A = d G (A), which proves (2). H 
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The class V is not a set (Problem xll.20) and it is quite hard to find ZFDC- 
universes which are sets. See Problems xl2.43, xl2.44 and xB.12. 

11.37. Consistency and independence results. All the consistency and indepen- 
dence results we have discussed in 8.24, 8.25, 10.33, 10.34, 10.36 and 10.37 can 
be strengthened by adding the Axiom of Replacement to the relevant theories. 
This is as good a place as any to collect the most general versions of these 
fundamental results, which are outside the scope of these Notes. 

(1) (Godel, 1939) The universe L of constructible sets is a model of ZFC, 
which further satisfies the Generalized Continuum Hypothesis GCPI. It follows 
that the Axiom of Choice AC cannot be refuted from the other axioms of 
ZFC, and that GCH cannot be refuted in ZFC. 

(2) (Cohen, 1963) None of the choice principles ACn, DC and AC can be 
proved from a weaker one using the constructive axioms of Zermelo (I) - (VI) 
and the Axiom of Replacement (VIII). 

(3) (Cohen. 1963) There is a model of ZFC in which the Continuum Hypoth- 
esis CH is false, so CH cannot be proved in ZFC. 

(4) (Solovay, 1970) There is a model of ZFC in which every “definable” , 
uncountable pointset has a perfect subset, and hence has cardinality c. This 
means in particular, that we cannot define a specific pointset A and then prove 
in ZFC that it has cardinality intermediate between H 0 and c. 

(5) (Solovay, 1970) There is a model of ZFDC in which every uncountable 
pointset has a perfect subset, so we cannot prove in ZFDC the existence of 
uncountable pointsets without perfect subsets. Solovay’s model also satisfies 
the Principles of Purity and Foundation. 


Problems for Chapter 1 1 

xll.l. Prove the Separation Axiom (III) from the remaining axioms in the 
group (I) - (V) and the Axiom of Replacement (VIII). 

xll.2. For each set A. each unary, definite operation F and each binary, 
definite operation G, there exists a least under C set A which contains A as a 
subset and is closed under F and G, i.e., 

A C A, x € A ==> F(x) € A. x, y € A ==> G(x, y) G A. 

(The same is true for any number of operations, of any number of arguments.) 

xll.3. The Axiom of Replacement is constructively equivalent with the fol- 
lowing proposition: for every set A and every unary definite operation F. 
there exists a set B which contains A and is closed under F , i.e., 


A C B&F[B] C B. 
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xll.4. The Axiom of Replacement is constructively equivalent with the fol- 
lowing proposition: for every set A and every unary definite operation F , the 
restriction 

F\A =df {(x,E(x)) | x £ A} 

of F to A is a function, i.e., a set of pairs. 

xll.5. If (x, y ) is a definite, binary operation which satisfies the first property 
of ordered pairs (OP1) in 4.1, then it also satisfies the second, (OP2). (This 
cannot be proved in ZDC+AC, see Problem xB.4.) 

xll.6. If | A | is a definite operation which satisfies the first condition on weak 
cardinal assignments (Cl), then it automatically also satisfies the third one, 
(C3). (This cannot be proved in ZDC+AC, see Problem xB.8.) 

xll.7. The definite condition of functionhood defined in (4-14) satisfies the 
equivalence 

Function(/) +=+ Set(/) & (Vic € /)(3x, y)[w = (x, y)] 

& (Vx, y, /)[[(.+ y) e / & (x, y') €/]=*• y = /], 

i.e., / is a function exactly when it is a single-valued set of pairs. (See also 
Problem xB.9.) 

xll.8. There exists a sequence (n i— > H„) which satisfies the identities 

No=|N|, H„ +1 = H+. 

We introduced these names for the first few infinite cardinals in (9-6), but this 
is not the same as proving the existence of the sequence ( n i— > H„). (See also 
Problem xB.10.) 

xll.9. Extended recursion with parameters. For every unary definite opera- 
tion G and every ternary definite operation //. there exists a unary definite 
operation F which satisfies the identities 

F(0,y) = G(y), 

F(n + 1, y) = F[(F(n. y), n. y). 

xll.10. If g 3 is a non-empty family of transitive sets, then the union (J W and 
the intersection f) W are also transitive. 

xll.ll. For every class A there exists a least, transitive class A which contains 
A, that is, such that A C A and for every transitive class B, A C B ==> A C B. 

xll.12. The class of all pure sets is transitive, as are the classes of hereditarily 
finite and hereditarily countable sets. 

xll.13. If x i e x 2 e • • • e x n = X\ , then xi is ill founded. 

xll.14. An object x is ill founded if and only if there exists some set A such 
that 


x e A & (Vj € A){3t e A)[s 9 t]. 
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xll.15. If Q 1 and £> 2 exist which satisfy (11-27), then they are distinct, hered- 
itarily finite, pure sets. 

xll.16. A set A is grounded if and only if its transitive closure TC(A) is 
grounded. 

*xll.l7. A set is in V m if and only if it is pure, grounded and hereditarily finite. 
Hint. Show first that every finite, transitive, pure grounded set is in V w . 

xll.18. For each transitive set I, let 

J = {x € I \ xis pure and grounded} 

and prove that 

M{J) = {x G M(I) | x is pure and grounded}. 
xll.19. If a set £1 = {£1} exists as in (11-26), then 

{x G Af(fi) | x is grounded} = V m . 
xll.20. Prove that the class V of all pure, grounded sets is not a set. 

11.38. Definition. A class K of atoms supports a set A if 
x G TC(^4) & Atom(.v) => x G K, 

and we let 

W[K] = {x | x is supported by AT}, (11-31) 

V[K] = {x | x is grounded and supported by K}. (11-32) 

xll.21. The class W[K] of sets supported by a class of atoms K is a ZFDC- 
universe. 

xll.22. The class V[/f] of grounded sets supported by a class K of atoms is a 
ZFDC -universe. 

xll.23. Let G = (N, — >) where m — > n n < m. Show that G 

is grounded and compute the image d G (m ) of each m under the unique 
decoration of G. 

xll.24. Give an example of an infinite, grounded graph G with d G [G] = {0}. 
*xll.25. Let G = {N\ {0}, -►), where 

m — > n •<=>• m ^ n& n divides m. 

Show that G is grounded and compute the image d G (m ) of each m under the 
unique decoration of G. 

xll.26. An extended decoration of a graph G is any surjection d : G — » d[G] 
such that for all x G G, 

,, \ _ f x, if x is an atom, 

X \ {d(y) | y <— x}, if x is a set. 
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Prove that every grounded graph admits a unique extended decoration and 
that a (not necessarily pure) set A is grounded if and only if there exists a 
grounded graph G and some x € G, such that A = d (x). 

*xll.27. Grounded (E-recursion. For each binary definite operation //. there 
exists a definite operation F(t), such that for every grounded set x, 

F{x) = H{F \x, x), 

where the function F \ x = {(t, F(t)) \ t e x} is the restriction of the 
operation F to the set x. 

This is a special case of the next, slightly more complex generalization of 
the Grounded Recursion Theorem 11.5. 

*xll.28. Suppose t < x is a binary definite condition which satisfies (1) for 
every x, the class {/ | t < x} is a set, and (2) there does not exist a sequence 
(n i— > x„) such that for all n, x„+i < x n . Prove that for every definite, binary 
operation II there exists another F, such that for every x, 

F(x) = H{F ({/ | t < x},x), 

where the function F \ {/ | t < x} = {(t, F(t)) \ t < x} is the restriction of F 
to the set {t \ t < x}. 
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The Axiom of Replacement finds its most important applications in von Neu- 
mann’s beautiful theory of Ordinal Numbers, and in the construction of the 
Cumulative Hierarchy of pure, grounded sets. One can live without know- 
ing the ordinals, to be sure, but not as well: they bring many gifts, among 
them true cardinal numbers which give substance to the “virtual” theory of 
equinumerosities with which we have been making do. The Cumulative Hier- 
archy extends the iteration of the power operation we have used to construct 
V co “as far as it will go” and presents the pure, grounded sets as the most 
compelling intuitive understanding of what sets really are. It is not so clear 
one can live without knowing that, not among set theorists, at any rate. 

Cantor describes his conception of “ordinal types” just a few pages after 
the definition of cardinals quoted in 4 . 19 , and in a very similar vein. 

Every ordered set U has a definite ’ordinal type’, . . . which we will 
denote by U. By this we understand the general concept which 
results from U if we only abstract from the nature of the elements 
u. and retain the order or precedence among them. Thus the 
ordinal type U is itself an ordered set whose elements are units 
which have the same order of precedence amongst one another as 
the corresponding elements of U, from which they are derived by 
abstraction. ... A simple consideration shows that two ordered 
sets have the same ordinal type if, and only if, they are similar, 
so that of the two formulas U = 0 V, U = V, one is always a 
consequence of the other. 

Cantor is speaking about arbitrary linearly ordered sets, but we will consider 
here only the problem of defining “ordinal types” for well ordered sets. He 
states explicitly the first key property 

U= 0 U (12-1) 

of the ordinal assignment operation, and argues for 

U= 0 V=>U=V. (12-2) 

Cantor’s implied “simple consideration” for (12-2) should also justify (for 
well ordered sets) the stronger implication 

U < 0 V =>77 C ~V: 
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0 {0} {0. {0}} {0.{0}.{0,{0}}} ••• co ®U{®} 

Figure 12.1. The von Neumann map of U : 0u, lu, 2u, .... cou, S u{cou)- 

because the position of a point x in a well ordered set depends only on the 
points preceding it, so the “unit” x abstracted from x and coding its place in 
U should depend only on the initial segment segt/(x). Thus, the problem of 
representing Cantor’s conception of ordinals in axiomatic set theory comes 
down to the following: can we assign a well ordered set U to each well ordered set 
U, so that (12-1) and (12-3) hold? Von Neumann’s ingenious idea is to define 
U by replacing recursively each member of U by the set of its predecessors. The 
construction is captured exactly by the Mostowski Collapsing Lemma 11.36. 

12.1. Ordinal numbers. The von Neumann map of a well ordered set U is the 
unique decoration of the associated grounded graph (Field) 17), >u), so that 
by (11-29), 

vu(x) = {vi/GO | y <u x}, (x G Field) C/)). (12-4) 

We define the ordinal number of U to be the image 

ord( U) = df vy [Field ( U)] (12-5) 

of its von Neumann map, and we set 

ON(a) a G ON 4=^ (3 well ordered U)[a = ord(C)]. (12-6) 
Suppose for example that 

U : 0c/, 1*7, 2c/, . . . ,cou, Su(a>u) 

is a well ordered set with least element 0 u, next 1 u, ... , first limit point 
cou, followed by the last (largest) point SV(®u). We compute the values of 
its von Neumann surjection by repeated applications of (12-4) (skipping the 
subscript): 


v(0f/) = 

= {v(x) 

1 * 

< 

0u} 

= 0 

= 0, 

Kit/) = 

= {v(x) 

X 

< 

It/} 

= {0} 

= 1, 

v(2u) = 

= {v(x) 

X 

< 

2 u] 

= {0-{0}} 

= 2, 

v(3 u) = 

= {v(x) 

X 

< 

3 u] 

= {0.{0},{0.{0}}} 

= 3, 

v(cou) = 

= {v(x) 

1 * 

< 

cou} 

= {0.{0},{0.{0}},.. 

. } = co 

{Su(cou)) = 

= {v(x) 

X 

< 

Suicou)} 

= co U {co} 



v[U] 

1 = 

{0 

', 1 , 2 ,... . 

co, co U {w}}. 



These computations are special cases of the following, general facts: 
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n 

U 


V 


vu 


Vv 


' id{x) = x ' 

ord((7) ► ord(F) 

Figure 12.2. The von Neumann map under initial similarities. 


12.2. Exercise. IfOu is the least point in a well orderedset U, then vu(Ou) = 0; 
and if S(x) is the successor of x in U, then 

vu(S{x)) = v v (x) U {vc/(a)}. 

12.3. Exercise. If x is a limit point in a well ordered set U , then 0 G vjj{x) and 

a G v v {x) =>■ a U {a} G v v {x). 

12.4. Exercise. If cou is the first limit point in a well ordered set U , then 

co = vu(cou) = Pi I 0€l& (Va G X)[a U {a} G 37]}, (12-7) 

and , in particular, co = Vu(<x>u) is independent of the particular well orderedset 
U used to compute it. 

This exercise makes clear about ca v what is evident about 0 ( - , 1 . in 

Figure 12.1: the value v v (x) is independent of the particular element x G U, 
and depends only on the place of x in U, whether it is the first element, the 
fifth, the first limit point or whatever. This is a general fact about the von 
Neumann map. which we can make precise as follows. 

12.5. Lemma (First Ordinal Property). If n : U >-» n[U] C V is an initial 
similarity from U into another well ordered set V , then the diagram in Figure 
12.2 commutes, i.e., 

v v (n(x)) = vu(x) {x G U). (12-8) 

Proof. Towards a contradiction, let x be the least element of U such that 
vy(n(x)) f vu(x), and compute: 

vy(n(x)) = {vy(y) | y < v n(x)} by definition, 

= {vv(n(t)) | t <u x} because n is initial, 

= {vtr (?) | t <u x} by the choice of x, 

= Vu(x), 

contradicting the choice of x. The key step here is the second one, where we 
used the fact that an initial similarity “has no gaps” in its image, so that each 
y < v 7i(x) is 7 z(t) for some t <u x. H 

12.6. Exercise. For any two well ordered sets U, V, 

U < 0 V => ord( U) C ord(F); 
and so, ifTJ = 0 V, then ord({7) = ord( V). 
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12.7. Lemma (Second ordinal property). For each well ordered set U and each 

x e u, 

Vu(x) = ord(seg£/(x)). (12-9) 

As a consequence , each von Neumann value vu(x) is an ordinal number, and 
conversely, each ordinal number a is the von Neumann value Vu(x) of some point 
in a well ordered set. 

Put another way: each member of an ordinal is an ordinal and every ordinal 
is a member of an ordinal. 

Proof. If we apply Lemma 12.5 with seg f -(.v) for U , U for V and the 
identity initial similarity n : seg ( /(v) >— > XJ , we get: 

v se gc ,(x)(j) = •wwM.y)) = *v{y) (. y <u x), 

so that, in the end 

vu(x) = WOO I y < u x} = {v segc , (x) (y) | y < v x } = ord(seg £/(*)). 

For the second claim, let V = Succ(t7) be the next well ordered set to U , 
with t added on top. as in 7.16: now XJ = seg^M, and so ord(C/) = vy(t). H 
Thus, we can think of ordinal numbers as standing either for lengths of well 
ordered sets, or for places of points in a well ordered set. The latter agrees 
more with the use of ordinals in ordinary language, where “first”, “second”, 

. . . customarily describe the place of objects in a sequence. 

12.8. Exercise (The finite von Neumann ordinals). If <n is the usual ordering 
of the set N of natural numbers, then 

ord(N, <n) = co, 

as this is defined in (12-7). Moreover, if we set 

S(o(n) = n U {n} (n £ co), 

then (co, 0, S 0) ) is a Peano system, and : N >— » co is the unique (by Theo- 
rem 5.4) isomorphism of the natural numbers with co. 

It is usual in advanced set theory to take (co, 0, S) as the Peano system we 
fixed in 5.9, i.e., to identify N with co. This is sometimes convenient, but 
neither necessary, nor especially natural — it is hard to argue that {0. {0}} is 
a better representation of the number 2 than Zermelo’s {{0}} which comes 
from the proof of Theorem 5.3, or the third member of any Peano system. 
Next comes the basic fact about ordinal numbers. 

12.9. Lemma (Third ordinal property). Each ordinal number a is well ordered 
by the relation 

u < a v w = v V u € v (u, v £ a); (12-10) 

and if a = ord((7) for a well ordered set U, then the von Neumann map 
Vu : U — » a is a similarity. 

It follows that every well ordered set is similar with an ordinal number, and 
every well orderable set is equinumerous with an ordinal. 
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Proof. Notice first that by Lemma 11.36, each a = Vf [ t/] is a transitive, 
pure and grounded set. and (writing v for vu), 

x <u y=^v{x) G v(y). (12-11) 

Moreover, 

X ± y => y(x) f v(y), (x, y G U)\ 

because if x ^ y, then either x < v y or y < v , so that by (12-11), either 
v(x) G v(y) or v{y) G v(x); and, in either case, we can’t have v(x) = v(y) 
since a is grounded. Thus v : U >-» a is an injection, hence a bijection, and 
the relation <„ is the image of <u by v, i.e., 

X <U }’ <=> v(x) <a v{y) (x,y G U); 

so < a well orders a, and v is a similarity. 

The last claim follows because similarities are bijections. H 

This remarkable result says, in effect, that there exist sufficiently long G- 
chains to mirror every wellordering, and it is a characteristic consequence 
of the Replacement Axiom. As we have been doing with structured sets 
throughout, by “the ordinal a” we will mean ambiguously the set a or the 
well ordered set (a, < a ), so that, for example. Lemma 12.9 is expressed simply 
by 

U = 0 ord (U). 

12.10. Exercise. For every ordinal number a, ord(a) = a. 

12.11. Corollary (Characterization of ordinals). A set a is an ordinal if and 
only if it is transitive, pure, grounded and G- connected , i.e., 

ON (a) [x = y V x G y V y G x] {x, y G a). 

Proof, Every ordinal has these properties, by Lemmas 11.36 and 12.9. For 
the converse, suppose A is transitive, pure, grounded and G-connected, and 
set (as if A were an ordinal) 

x <a y <==>■ x = y V x G y, (x, y G A). 

This relation (easily) well orders A, so let v A : A >-» ord(A, < A ) be its von 
Neumann map. We now claim that v A is the identity map; if not, let x be 
<, 4 -least such that v A (x) f x, and compute; 

v A {x) = {v(y) | y < A xj 

= {y | y < A x} (by the choice of x) 

= {y | y G x} (by the definition of < A ) 

= x (because x has no atoms) , 


contradicting the choice of x. Thus ord(A,<^) = v a [A] = A, and A is an 
ordinal. H 
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This characterization of ordinal numbers is especially simple when we as- 
sume that all sets are grounded and pure, as we do in ZFC: a set then is an 
ordinal exactly when it is transitive and €- connected . 

The three basic ordinal properties also give a strong solution to Cantor’s 
problem of defining ordinal types for well ordered sets, as we formulated it 
in (12-1)- (12-3). 

12.12. Theorem. The definite operation U i— > ord(f7) on well ordered sets 
satisfies the following conditions : 

U = 0 ord(C/), (12-12) 

U < 0 V => ord( U) C ord( V), (12-13) 

ON(a) => a = {p e ON | p < 0 a}. (12-14) 

Proof. The first property (12-12) is a restatement of Lemma 12.9. 

To prove (12-13), suppose 7 1 : U >-» n[U] C V is an initial similarity. By 
Lemma 12.7. taking images, 

v v [n[U}} = v v [U] = ord(C); 

and since v v is a similarity of V with ord( V), it carries initial segments onto 
initial segments, so that 

ord({7) = v v [U ] = v v [n[U}} C ord(F). 

Finally, for (12-14), if a = ord(C7), then: 

« = W0>) I y g u] 

= {ord(seg [/(>>)) | y € U} (by Lemma 12.7) 

= {/? € ON | [1 < 0 a}, 

the last because the well ordered sets which are < 0 U are exactly those similar 
with the proper initial segments of U. H 

12.13. Exercise. For any two well ordered sets U V , 

U = 0 V 4=4- ord(C) = ord(F), 

and so for each well ordered set U, there is exactly one ordinal number a such 
that U = 0 a. 

Conditions (12-12) and (12-13) are precisely Cantor’s (12-1) and (12-3). 
The key, last condition (12-14) is characteristic of the von Neumann ordinal 
assignment. Problem xl2.4. This is an interesting result; we formulated it 
as a problem only because it makes for a good one, and we will not need to 
appeal to it. However, it is easy to get lost in proving scores of elementary 
properties of ordinals, some useful, others just challenging, and the proofs 
from the definition are a bit confusing: it is not entirely natural to think 
of the membership relation as an ordering. It is good practice, at least in 
the beginning, to prove properties of ordinals directly from their three, basic 
properties isolated above, which summarize their most basic features. 
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12.14. Lemma (Ordinal comparison). For any two ordinals a, fi. 

a < 0 f> -£=>■ a = fi V a G fi •<=>■ a C fi a C /?. 

Proof. We give a round-robin argument of the strict versions of the claimed 
equivalences. 

(1) a < 0 /? => a £ fi follows immediately from (12-12) and (12-14), since 
the hypothesis means that a = ord( U) and fi = ord( V) with U < 0 V. 

(2 ) a £ (1 => a Cf f> and (3) a Cf fi ==» a C fi follow equally easily from 
(12-12) and (12-14) and we will skip them. 

(4) a C fi => a < 0 fi. The hypothesis gives us an injection from a to fi 
(the identity!), and so a < 0 fi by Corollary 7.32; but a = 0 fi implies a = fi 
by Exercise 12.6 which contradicts the assumed, proper inclusion a C fi, and 
so a < 0 fi. H 

It is traditional to use for the ordering on ordinals the simplest notation, 
a < fi 4=>- df a < 0 fi (a, fie ON). (12-15) 

keeping in mind its equivalent characterizations in Lemma 12.14. We sum- 
marize its properties in one, now simple result. 

12.15. Theorem (The ordering of ON). (1) The class ON of ordinal numbers is 
well ordered by the condition a < fi. in the following precise sense: 

a < a, a < fi & fi < y => a < y, a < fi & fi < a =4> a = ft, 
a</iWa = /iW/i<a, 
and for every definite condition P. 

(3a e ON)P(a) => (3a e ON)[P(a) & f/fi < a)^P(y?)]. 

In particular, there is no infinite descending chain of ordinals, 

«o > ol\ > ai > • • • (3«)[a„ = a„+i], (12-16) 

When P(a) holds for some a, we set 

{pa e ON)P(a) = min {a e ON | P{a)}. (12-17) 

(2) For each ordinal number there is a next one, 

5(a) = df ipfi G ON) [a < fi] = a U {a}. (12-18) 

(3) Each set A of ordinal numbers has a least upper bound, 

sup A =df {pfi G ON)(Va G A)[a < fi] ={JA, (12-19) 

which is the maximum of A {if A has a maximum) and 0 if A = 0. 

Proof is left for the Problems, xl2.1 - xl2.3. 3 

12.16. Exercise. For each non-empty set of ordinals W , 

{pa G ON)[a G %] = ffi'S . 
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The successor ordinals are those of the form S (a) and the limit ordinals are 
those which are not successors or 0, so that a < X=> S{a) < X; these are 
also characterized by the property 

Limit(A) •<=>■ X 0&X = sup{a | a < 2}. (12-20) 

We can prove properties of ordinals by transfinite induction and define 
operations on them by transfinite recursion, as follows. 

12.17. Theorem (Ordinal induction). For every unary definite condition P, 

(Va)[(Vf <a)P(0=^(«)] =► (Vo)P(a). 

Proof. Towards a contradiction, let a be least such that -<P(a); now P( £) 
holds for all £ < a, and so P(a) holds by the hypothesis, which we assumed 
it does not. H 

12.18. Theorem (Ordinal recursion). For every binary definite operation FI, 
there exists a unary definite operation F , which satisfies the identity 

F{a) = H(F f a, a) (a G ON). (12-21) 

Here F (a is the function {(£, F{£)) (6 a} obtained by restricting F(£) to 
a = {£ | £ < a}. 

Similarly, with parameters, given H(w,a,x), there exists F(a,x) such that 
for all a, x, 

F(a,x) = H({(£,F(£,x)) \£<a},a,x) (a G ON). 

Proof of the simpler, parameter-free version. 

For each [1. by Corollary 11.6 on the well ordered set (fi, </;), there exists 
exactly one function / p : fi — » Ep which satisfies the identity 

fp{ot) = H{f/j\{x e fi \ x <ft a}, a) ( a < fi), 

= H (f /j \ a, a) , (12-22) 

using the fact that <p coincides with G and the members of [1 are ordinals. 
We claim that 

if a < P and a < y, then / fia) = f y (a)\ 

if not. then there would exist a least a for which this fails for some P and y, 
and then (12-22) yields a contradiction immediately. Thus, we can set 

F(u) = fs(a)(o 

so F(a) = f /j (a) for any p > a and (12-22) implies the required identity for 
the operation F. 

The proof for the version with parameters is only notationally more com- 
plex. H 

Using this theorem, we can define arithmetical operations on ON and study 
their structure. We will leave most of this for the problems, but it is worth 
recording here the two most basic definitions, as examples of Theorem 12.18, 
and in order to have some notation available to name specific ordinals. 
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12.19. Theorem (Ordinal addition and multiplication). There exist binary , def- 
inite operations a + /? and a ■ ft on the ordinals which satisfy the following 
identities'. 

a + 0 = a, 

a + S{p) = S(a + p), (12-23) 

a + A = sup {a + /? | /? < X), if Limit(A). 
a • 0 = 0. 

a • S(p) = (a • P) + a, (12-24) 

a ■ A = sup {a ■ P \ P < A}, if Limit)!). 

Proof. We set a + p = F{p, a), where F(p, a) is defined by the following 
recursion on p e ON, with a as parameter: 

(a, if /? = 0. 

F(p. a) = < S{F(y, a)), if p = S{y), for some y, 

[sup{.F((i;, a) \ £ < P}. if Limit)/?). 

We leave for Problem xl2.6 the (similar) argument for multiplication. H 
We have already introduced the ordinal co in (12-7), and proved in Exer- 
cise 12.8 that co = ord(N, <n). The ordinals following it immediately are 
obviously 

co -f 1 = S (co) , co + 2 = S (co — L 1 ) , co-t-3 = *S (co -t- 2) , . . . 
and right above these comes 

co + co = sup {co + n | 7i € co} = co • 2. (12-25) 

This is the second limit ordinal, the first one above co. Each co • n can be 
obtained by adding co to itself n times, directly from the definition. Next 
comes 

co 2 = sup {co ■ n\n < co}, 
after a while co 3 = co 2 ■ co, etc. 

Many of the properties of ordinal addition and multiplication are most 
easily derived from the properties of these operations on (well ordered) sets, 
as we defined them in 7.37 and 7.38, using the following three exercises. 

12.20. Exercise. For every ordinal a, 

a + 1 = ord(Succ(a)), 

by the definition of the successor poset Succ(P) in 7.16. 

12.21. Exercise. For cdl ordinals a, /?, 

a + P — ord(a + 0 /?), 

by the definition of addition of posets in 7.37. It follows that addition of ordinals 
is associative but not commutative, and that 


(12-26) 
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12.22. Exercise. For all ordinals a, ft. 

a ■ ft = ord(c»! ■„ ft), 

by the definition of multiplication of posets in 7.38. It follows that multiplication 
of ordinals is associative but not commutative, and that 

0 < a & ft < y => a ■ ft < a ■ y. ( 12 - 27 ) 

On the other hand, some of the properties of ordinal arithmetic are more 
easily shown by ordinal induction, directly from their recursive definitions: 

12.23. Exercise (Distribution of • over + to the right). For cdl ordinals a, //. y, 

a • (ft + y) = a ■ ft + a ■ y. 

The solutions of these exercises suggest methods for establishing several 
additional results of ordinal arithmetic, which we will leave for the problems. 
Note that in addition to commutativity, many other properties of natural 
number arithmetic fail for ordinals, including the distribution of • over + to 
the left. Problem xl2.13. 

12.24. Limits of ordinal sequences. If «o < or < • • • is a non-decreasing 
sequence of ordinal numbers, we will write 

lim„ a„ = sup{a„ | n = 0, 1, . . . }. 

The notation is useful, but we must be careful when we use it because these 
limits do not satisfy the usual “limit theorems” of calculus, cf. Problem xl2.9. 

Next we describe von Neumann’s elegant solution of the problem of cardinal 
assignment 4.20 for well orderable sets, which is based on the fact that every 
one of them is equinumerous with an ordinal. Lemma 12.9. 

12.25. Definition (Von Neumann cardinals). We set 

j G ON)[^4 = c f\, if A is well orderable, . , 

\A\ - < (12-2 8j 

Id, otherwise, 

and we assume from now on that the cardinal assignment we fixed back in 4.21 
of Chapter 3 is, in fact, this one. The values of |^4| for well orderable A are the 

von Neumann cardinals. 

Card t ,(«) *£=> for some well orderable A, k = \A\. ( 12 - 29 ) 

and they are easy to characterize as the initial ordinals: 

12.26. Exercise. Card„(re) ON(k)&(Vq! < k)[k f c a], and for every 

k € Card„, |k| = k. 

By Lemma 9.11, 

Card„(«;) Card t ,(«; + ), 

and by Lemmas 9.18 and 9.20, if W is a non-empty set of cardinals, then 

(V/c G g’)Card„(/«) Card t ,(inf c .(g’)) and Card„(sup c .(g’)); 
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in fact immediately from (12-19) and 12.16, if W is a non-empty set of von 
Neumann cardinals, then 

inf c (£f) = min(g’) = sup f (e?) = supW = IJg’. 

The von Neumann construction provides a strong cardinal assignment on 
the class of well orderable sets, in the sense of 4.21: 

12.27. Exercise. If A and B are well orderable , then 

A =, \A\, and A = c B <£=> \A\ = |5|. (12-30) 

Moreover, for each set W of sets, the class {\X\ \ X £ 8 ’} is a set. 

The most useful property of von Neumann cardinals is the simplest: 
if k, A £ Card tI . then u = c X A=A- k = X, 

this follows immediately from the last two exercises and transforms all the 
equinumerosities between von Neumann cardinals into equations. Finally, 
for well orderable cardinals, we can write 

k ■ (2 + /<) = k ■ X T k ■ ju, , 

etc., without the annoying subscript c . 

12.28. Cardinals, Choice and Replacement. One can make a good case that 
Cantor’s units in the intuitive description of cardinals quoted in 4.19 are 
modeled faithfully by the von Neumann ordinals, and the quotation 

A grows, so to speak, out of A in such a way that from every element 
x of A a special unit of A arises 

describes precisely the construction of \A\ = ord(A) = v[A] relative to some 
best wellordering of A, Problem xl2.28. Whatever the value of the imagery, 
von Neumann’s construction is certainly very useful, if only because it pro- 
duces a genuine cardinal arithmetic from the calculus of equinumerosities that 
we have developed, at least when we assume the Axiom of Choice, so that all 
sets have von Neumann cardinals. 

About AC, we have been careful to state results about cardinality without 
assuming it. whenever this was possible, but of course the main effect has 
been to make clear just how poor cardinal arithmetic is without it. The main 
problem is the equivalence of cardinal comparability with AC: we do not have 
much of an arithmetic if we cannot compare numbers for size, and we cannot 
assume comparability without (necessarily) conceding the truth of the full 
Axiom of Choice. 

Granting AC, how important is the existence of “true cardinals” which 
satisfy (12-30) and whose construction requires not only AC but also the 
Axiom of Replacement? Not much, by any account, unless you are allergic to 
subscripts. Thus, it might appear that von Neumann’s solution of the problem 
of Cardinal Assignment is primarily an exercise in mathematical elegance. 
There is some truth to this, but one must not draw the further conclusion that 
the Axiom of Replacement is unimportant for cardinal arithmetic, just because 
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its basic identities can be established in ZDC+AC, as equinumerosities. The 
problem is that ZDC+AC cannot prove the existence of any cardinals above 
the first infinite sequence 

Ho, Hi, fb, .... 

and in fact it cannot even show that the sequence (n i— > N„) exists. Problem 
xB.10. In particular, the existence of singular cardinals cannot be shown in 
ZDC+AC, so that the whole theory of cofinality remains possibly vacuous 
without the Axiom of Replacement. 

The upshot is that to have a decent cardinal arithmetic, you must assume 
both the Axioms of full Choice and Replacement, i.e. , to work in a theory 
at least as strong as ZFDC+AC. It is sometimes claimed that the Principle 
of Foundation is also necessary for cardinal arithmetic, but this is not true — 
although some of the most important applications of cardinals are to the 
structure of von Neumann’s universe V of pure, grounded sets. 

12.29. Proposition (The alephs). By recursion on a € ON, we set 

Ho = \N\ = CO, 

Hy?+i = H+, (12-31) 

Ha = supjH^ | p < 2}. if LimitU). 

Each H q is a von Neumann cardinal , 

a < p => K q < c Vi p (a. P € ON), 
and every infinite von Neumann cardinal is Via for some a. 

Proof. To check first that a i— > Vi a is strictly increasing in cardinality, fix a 
and (towards a contradiction) choose P least such that a < p and H Q = f Vi p : 
now P f a + 1, since H Q < c K+ = H a+ i; p = y + 1 for some y > a implies 
that Via <c H y < c Vi p (by the choice of P), which is a contradiction; and if P is 
limit, then a+1 < p, andsoH tt < c K a+ i < c which is also a contradiction. 

It follows that each is a von Neumann cardinal: this is immediate for 0 
and successor ordinals, and for limit 2, if H^ is not a cardinal, then < c H^ 
for some P < 2, which contradicts H^ <+ 

Finally, to show that every von Neumann cardinal is an aleph, let (towards 
a contradiction) n be the least counterexample. Directly from the definition, 
k f co and n 2 + for any ordinal 2 < re, since that implies k = |2| + , |2| is 
an aleph by the choice of k, and then n is also an aleph. Let 

P = {a < « | H„ < c re}. 

This is an ordinal, since it is closed under <; it is a limit ordinal, because 
H „ < re =+> H tt+ i < re; and by its definition, 

a < P =+ Hy? < re. 

On the other hand, if 2 is a cardinal and 2 < re, then 2 = H Q for some a < p, 
by the choice of re, so that finally 

re = sup{N Q | a < P} = H/j. H 
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Figure 12.3. Logarithmic rendition of the pure, grounded sets. 

The operation anH a supplies a useful notation for cardinal arithmetic, 
especially when we accept the Axiom of Choice. 

12.30. Exercise. The Axiom of Choice AC is equivalent to the proposition that 
“every infinite cardinal is an aleph ”, 

AC ^ (V infinite A) (3a G ON) [,4 = c \A\ = H Q ]. 

12.31. Exercise. (AC) The Generalized Continuum Hypothesis is equivalent to 
the cardinal identity 

GCH <$=► 2*“ = H a+1 (a G ON). 

12.32. The Cumulative Hierarchy of Pure, Grounded Sets. For each ordinal a 
we define the set V a by the following recursion on ON: 

V 0 = 0. 

V«+i =V{V a ), 

Vx = U a <;V„, if Limit (A). 

The von Neumann universe is the union of all the V«’s 

V = d f U aeON^o = i x I for some a e ON ’ x e (12-32) 

and on it we define the rank operation by 

Rank(x) = (jua G ON)[x G V Q+ i] (x G V). (12-33) 

We have already used the symbol V to denote the class of pure grounded 
sets, because of the next result. 

12.33. Theorem. (1) Each V a is a pure, transitive, grounded set, and 

a<f=>V a C Vp. 

(2) If X is a limit ordinal, X > m ■ 2. then Vx is a Zermelo universe. 
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(3) For each pure set A, A C V ==> A e V. 

(4) The von Neumann universe V comprises the pure, grounded sets. 

Proof. The arguments for Parts (1) and (2) are those we used to prove the 
corresponding properties of the basic closure sets M{I) with transitive / in 
Lemma 11.15, and we will not repeat them. 

(3) Assume that A C V and let Rank[A] = {Rank(x) \ x G A} be the 
image of the rank operation on A. This is a set of ordinals, so there exists 
some ordinal k strictly above its members. 

x € A Rank(x) < k => x e V K , 
and (using the purity of A), 

A = {x G V K | x G A} G V*+i. 

(4) We use the contrapositive of (3), which says that for pure A, 

A i V => (3x g A)[x <t V]. (12-34) 

Suppose first that M is pure, transitive and grounded but M (f-_ V. The 
transitivity of M and (12-34) yield 

(Vx G M \ V)(3v G M \ V)[y G x], 

and then DC gives us a descending e -chain which proves M ill founded, 
contradicting the hypothesis. Thus, every pure, transitive, grounded set is in 
V, so for every pure, grounded set A, TC(A) G V, and then A G V. since 
A gTC{ A) and V is transitive. H 

Recall that from Theorem 11.34, V is a ZFDC-universe, in fact a ZFC- 
universe if we assume AC. 

12.34. The naive notion of pure, grounded set. In discussing the Principle of 
Foundation in 11.32, we argued that the definition of the least Zermelo uni- 
verse Z and the proofs of its basic properties can be understood directly and 
naively, as we usually understand mathematics, and that they carry consid- 
erable force of persuasion as an intuitive conception of “set” which justifies 
the axioms of ZDC+AC and the Principle of Foundation. In the same vein, 
we can argue that the “construction” of von Neumann’s universe V in 12.32 
and Theorem 12.33 can be understood directly and naively outside the details 
of any specific axiomatization, and that it puts forward a natural, intuitive 
conception of “pure, grounded set” which justifies the axioms of ZFC. It is 
worth looking into these arguments a little closer. 

The construction of Z begins with the infinite set 

No = {0,{0}.{{0}}....} 

and iterates the powerset operation V{A) infinitely many times. Since the 
“infinity” of iterations involved is no more and no less than that embodied by 
No, we can say that to understand Z we must understand two infinitary things: 
the set No (basically the natural numbers) and the powerset operation. 



Chapter 12. Ordinal numbers 


189 


The construction of V starts (literally) with nothing, the empty set, but it 
proceeds to iterate the powerset operation V{A) through all the ordinals. On 
the same analysis, it is fair to say that to understand V we must understand the 
class of ordinals ON and the powerset operation. One may attempt to speak 
eloquently about the ordinals and justify them, as one might try to justify the 
natural numbers or the powerset operation. It should be clear, however, that 
the ordinals represent a separate and different new ingredient in our intuitive 
understanding of V, they cannot be reduced to N 0 and the taking of powersets. 
From this point of view, the justification of the axioms of ZFC which we find 
in this intuitive construction is considerably weaker than the justification of 
ZDC+AC we get from contemplating Z. 

In Appendix B we will consider alternative set universes, including some 
which contain both atoms and ill founded sets, and in more advanced text- 
books one can find a multitude of fascinating models of set theory constructed 
(primarily) by extensions and combinations of Godel’s constructibility and 
Cohen’s forcing. Part of the reason we have worked here in the weak systems 
of ZDC and ZFDC is to ensure that the elementary results of axiomatic set 
theory we have established apply directly to (essentially) all these models. 
These models, however, are all constructed starting with some given model of 
ZFC, and it is not clear how to produce for any of them independent, intuitive 
notions of what sets are, which justify directly the axioms they satisfy without 
also justifying the axioms of ZFC. 

It appears that (as of now), the intuitive conception of pure, grounded set, 
which is gleaned from an informal analysis of 12.32 and 12.33, is by far the best 
replacement we have for Cantor’s unfettered (and self-contradictory) notion 
of “collection into a whole of definite and separate objects”. 

12.35. About atoms and applications. In 3.25 we argued (with Zermelo) that 
it is useful to allow atoms in axiomatic set theory, so that our theorems apply 
directly to sets of planets or frogs, as well as to the pure, grounded sets made 
up out of nothing. Can the universally accepted, standard theory ZFC which 
does not allow atoms justify the applications of set theory? There are two 
good responses to this. 

First, we can model physical objects and relations among them by structures 
made up of pure sets, much as we model ( up to isomorphism) ordered pairs, 
functions, the natural numbers, etc. For example, to study the behavior of 

a system of heavenly bodies P\ Pk interacting with each other under 

gravity, we might represent each of them by a function P t : R. — > M 7 , which 
assigns to each real number t G R the mass, position and velocity of P,. 
relative to some fixed coordinate system and units. The laws of gravity and 
motion will then determine these functions; and the physicist does not care 
whether the set theoretic objects which model the functions Pi,... . Pk are 
pure or not — what matters are the relations among these functions, which can 
then be interpreted as relations among the planets and checked against reality 
by observation. 
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Figure 12.4. Rendition of the grounded sets over K. 

Second, if we value the ability to talk directly about planets within set 
theory, we can allow a class of atoms K and replace V by the class V[K] of 
grounded sets supported by K defined in 11.38. This is a ZFDC-universe by 
Problem xll.22; it satisfies the Axiom of Choice;, and, in the interesting case, 
when A is a set, looks very much like and has essentially all the properties 
of V, and by the same proofs, cf. Problem xl2.32. It can be constructed in 
ZFDC by the ordinal recursion 

v 0 [fq = K. 
v a+ i [K] = r(v a [K]), 

VAK] = {j a<i y a [Kl if Limit (A), 

V[K] = {J a V a [K]. 

The main point is that the physicist does not care about the difference 
between these two approaches, and probably cannot even see it: because 
all that matters for the application of mathematics (in the example) are the 
functions P, which codify the properties of the planets, just as all that matters 
about the natural numbers is that they form a Peano system — what they 
actually are is of no consequence. And so, ultimately, it is most useful to 
“ban” atoms and accept the simpler ZFC as the “standard” axiomatic set 
theory, which is what is done without exception in all advanced work on our 
topic. 


Problems for Chapter 12 

xl2.1. Prove (1) of Theorem 12.15. 
xl2.2. Prove (2) of Theorem 12.15. 
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xl2.3. Prove (3) of Theorem 12.15. 

^ x 1 2.4. Characterization of von Neumann ordinals. Suppose <f>( V) is a definite 
operation which assigns well ordered sets to well ordered sets and which 
satisfies the following three conditions: 

V= 0 <t>(v), 

U< 0 V=>cj>(U)Qci>(V), 

Field(0(K)) = {Field(0( t/)) | U < 0 V}. 

Prove that ord(F) = a => 4>{V) = (a, <„). 

xl2.5. The class ON is not a set. 

xl2.6. Justify the definition of ordinal multiplication in Theorem 12.19. 
xl2.7. For all ordinals a, /?, y,3: 

0 + a = a, and co < a => 1 + a = a, 

0 < p => a < a + p, 
a < [5 &y <6 => a + y < ft +S, 
a < P &y < S => a + y < P + d. 

Show also that, in general, 

a < p does not imply a + y < P + y. 

xl2.8. If a 0 < a\ < ■ ■ ■ is a strictly increasing sequence of ordinals, then the 
limit lim„ a„ is a limit ordinal. 

xl2.9. Give examples of strictly increasing sequences of ordinals such that 
1 i m n ( T /l) 7 ^ lim« o/. n T P- 
lim„(a„ + P„) ± lim„ a„ + lim„ P„. 
xl2.10. For all ordinals a. /?, y,S: 

0 • a = 0 

0 < a & 1 < P => a < a ■ P 
a < P &y <3 => a ■ y < P ■ 5 
0 < a < p &y < 3 => a ■ y < P ■ 8. 

Show also that even when y > 0. in general. 

a < P does not imply a ■ y < P ■ y. 
xl2.11 (Cancellation laws). For all ordinals a. p , y, 

a + P < a + y => P <y, 
a + P = a + y => P = y, 
a ■ P < a ■ y =$■ P < y, 

0 < a &a ■ P = a ■ y => P = y. 



192 


Notes on set theory 


Show also that, in general, 

0 < a&p ■ a = y ■ a does not imply ft = y. 
xl2.12. For all a > co and n < co: 

n + a = a, 

(a + \) ■ n = a ■ n + \ (n > 1), 

(a + 1) • co = a ■ co. 

xl2.13. Give an example of three ordinals a, [i. y for which 
{a + P) ■ y ^ a ■ y + P ■ y. 

Hint. Use the preceding problem. 
xl2.14. If n < m . then co" + co'" = co'". 

xl2.15 (Ordinal subtraction). If a < y, then there exists exactly one p such 
that y = a + p. 

xl2.16. Every ordinal a < co 2 can be written uniquely in the form 

a = co ■ x + y (x, y < co). 

*xl2.17. For any ordinal a < co N , there are unique n < N,x < co, P < co " 
such that a = co n ■ x + p. Hint. Choose n largest such that co" < a, and v 
largest such that co" • x < a. 

xl2.18. For any N > 0, every ordinal a < co N can be written uniquely in the 
form 

a = co" 1 • xi + co" 2 ■ X 2 + h co" 1 • x s + x i+ i, (12-35) 

where N > n\ > n 2 > • ■ ■ > n s and vq, . . . , x s+ \ < co. Hint. Use induction 
on N and the preceding problem. 

12.36. Definition (Normal operations). A unary, definite operation F on the 
ordinals is normal if it is strictly increasing 

a < p=>F(a) < F{p), 

and continuous at limit ordinals, i.e., 

F(A) = sup {F(P) | p < 2}. if Limit (A). 

xl2.19. If F : ON — > ON is a normal operation, then for every a, a < F(a), 
and for every limit ordinal 2, F(X) is a limit ordinal. 

xl2.20. For any fixed a . the operations 

S a (p)=a + p. P a (P) = a-p 

of “adding to” and “multiplying on the left by” a are normal. 

xl2.21. The composition F (a) = G(H(a)) of normal operation is normal. 
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xl2.22. Define ordinal exponentiation a P (for a > 1), so that 


a s{ ^ = ■ a, 

a x = sup{c/ | ft < X}, if Limit (A), 
and prove the following: 

(1) If ft < y, then a P < a y . 

(2) For each a > 1, the operation E a {ft) = a P is normal. 

(3) a [fs+y) = a? ■ a y . 

(4) ( a^Y = aP y . 

Hint. Exercises xl2.20 and xl2.21 and (2) simplify considerably the proofs 
of (3) and (4). 

*xl2.23. If a > 0. then there is a largest ft such that a/ < a, and for that ft, 
a = co^ 4- y with some y < a. Hint. Use Problem xl2.15 to get the y, and 
prove that y < a/ +1 , which contradicts y > a. 

*xl2.24. (1) If ft < y, then cc/ + co y = co y . 

(2) Every ordinal a > 0 can be written uniquely in the form a = a>P + y 
with y < a. 

x!2.25 (Cantor normal form). Every ordinal a > 0 can be written uniquely 
in the form of a finite sum of non-increasing powers of a>, 

a = to* + o* + • • • + co A (fti > ft 2 > ••• > ft s ), (12-36) 

or, equivalently. 

a = co 111 ■ n\ + g/ 2 • • • «2 + • + ft/' • n, 

(fti>ft 2 >---> ftt, n t < OJ, m ± 0). (12-37) 
Hint. For the uniqueness, prove first by induction on s that 

if ft > ftt > fti > • • • > fts and y = co ^ + a / 2 + • • • + co & , 

then y < co^ + y. 

If £o is the least fixed point of the normal operation ( a i— > co a ), then its 
Cantor canonical form is the useless 

£0 = ft / 0 = CO ™' 0 = • • • 

But the ordinals less than e 0 have non-trivial Cantor canonical forms which 
provide simple and (sometimes) useful representations of them. 

x!2.26. Find the Cantor canonical form of co ■ {co m + 1) + (ft/ J + 1) • co. 


*xl2.27. The only ordinals which belong to the least Zermelo universe Z are 
the finite ones. 
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xl2.28. If < is a best wellordering of A, then \A\ = vu[A], i.e., \A\ is the 
ordinal assigned to the well ordered set (A, <) by its von Neumann map. 

xl2.29. The class Card t , of von Neumann cardinal numbers is not a set. 

xl2.30. (AC) The definite operation 3„ (read “Beth-alpha”) is defined by 
the following recursion on ON : 

3 0 = N 0 = |N| = co, 

3/m = 2 ^, (12-38) 

3a = sup {3^ | ft < 2}. if Limit (A). 

Prove that for every ordinal a, 

| — 3q; . 

xl2.31. For every ordinal a, Rank(a) = a. 

xl2.32. For each set of atoms K, the rank hierarchy ( a i— > V a [A]) has the 
following properties. 

(1) Each V a [K] is a transitive, grounded set, supported by K, and 

a<P=*V a [K]QVit[K\. 

(2) If 2 is a limit ordinal, 2 > co ■ 2, then V, [A'] is a Zermelo universe. 

(3) For each set A supported by K, A C V[A] => A e V[K}. 

(4) The universe V[K] comprises the grounded sets supported by K. 

xl2.33. For each set of atoms K and each ordinal a, 

V Q = V a [K ] n [x | x is pure}. 

*xl2.34. Suppose n : K K is a permutation of a set of atoms K. Prove that 
there is a unique extension n* : V[A'] >-» V[A'J of n. which is an automorphism 
ofV[A], i.e., 

x€y <=3- 7i* (x) € 7i* (y) {x,y &V[K]). 

12.37. Definition (Closed unbounded classes). A class of ordinals M C ON 

is unbounded if 

(V£ G ON)(3a G ON)[c < a&a G M]; 
and it is closed if for every set of ordinals A ^ 0, 

ACM =£> sup A G M. 

12.38. Exercise. If M is a closed class of ordinals and 

olq < ot\ < • • • (a„ G M) 

is an increasing sequence of ordinals in M , then lim„ a n G M . 

*xl2.35. If M\ and M 2 are closed, unbounded classes of ordinals, then then- 
intersection M\ n Mi is closed and unbounded. 
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*xl2.36. Every normal operation on the ordinals has a fixed point, i.e. , for 
some ordinal, a, F(a) = a ; in fact, the class 

fp(M) = {a G ON | F(a) = a } 

of fixed points of F is closed and unbounded. Hint. To get one fixed point, 
let a 0 = 0, a n+ \ = F{a„), and show that F( lim„ a n ) = lim„ a n . 

xl2.37. (1) There exists a von Neumann cardinal k, such that 

k = 

(2) (AC) There exists a von Neumann cardinal A, such that 

A = 3*. 

12.39. Definition. Suppose a < /? are infinite limit ordinals. A function 
/ : ol — * /I is cofinal if 

sup {/(£) | £ < a} = p. 

The identity (cj i — s- if) is a cofinal function on every limit ordinal, for exam- 
ple. but (n i— > H„) is also cofinal, from co to H 0J . 

The next problem establishes a useful characterization of the cofinality 
operation, defined in 9.23, which is often taken as its definition. 

xl2.38. For each von Neumann cardinal k, 

cf(ft) = min{a | there exists a cofinal / : a — > n}. 

xl2.39. Prove that for all von Neumann cardinals /. < k. there exists a cofinal 
function / : X — > k if and only if cf(A) = cf(/e). 

xl2.40. For every regular /, cf(H;) = A, so there exists cardinals of every 
regular cofinality. 

*xl2.41. There exist singular cardinals of every regular cofinality. 

xl2.42. For every von Neumann cardinal A with cf(A) > Hi. the set V, is 
a Zermelo universe which further satisfies the following special case of the 
Replacement Axiom: if F is a definite operation and x € V, =>■ Fix) G V;., 
then the image F[A] of every countable A G Va is also in Vu 

12.40. Definition. (AC) An uncountable cardinal number k is strongly inac- 
cessible if it is regular and 

A < K => 2 A < K. 

*xl2.43. (AC) If k is strongly inaccessible, then V K is a ZFC-universe. 

*xl2.44. (AC) Ifa pure and grounded set Mis a ZFDC-universe, then M = V K , 
for a strongly inaccessible cardinal n. 
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12.41. Frege cardinals. We have followed Cantor in his approach to the theory 
of cardinal numbers, by which the property 

A= C \A | (12-39) 

is most fundamental. There is another approach due to Frege, which takes 
\A\ to be not a set of “units” equinumerous with A, but the abstract notion 
of “being equinumerous with A”. Frege understands “1”. for example, as the 
common property of all singletons. To model this idea in set theory, it is not 
important to define |^4| so that it is equinumerous with A, in fact, it is not even 
necessary for \A\ to be a set! The only important property of cardinals is the 
last one, 

A= e B <*=► \A\ = \B\, (12-40) 

which (in effect) makes the operation \A\ a “determining surjection” of the 
“equivalence condition” = c , with the cardinal numbers as the “quotient class”, 
in the natural extension to classes of the terminology in x4.5. Frege tried to 
capture this idea by setting 

\A\ = {X\X = c A}, (12-41) 

but the class {X \ X = c A} is not a set (when A ^ 0, easily) and the (necessary 
for the theory) assumption that it is led Frege to a contradiction. 

Von Neumann cardinals reconcile the Cantor and Frege approaches by 
satisfying both (12-39) and (12-40), but their definition depends on both 
the Axioms of Choice and Replacement. Problem xl2.46 describes another 
approach, due to Scott, which succeeds in defining Frege cardinals without 
the Axiom of Choice, but (essentially) only for pure, grounded sets. Scott’s 
construction is important, not so much for rescuing Frege cardinals (since little 
cardinal arithmetic can be done without AC anyway), but for the simplicity 
and elegance of the method, which has many uses beyond the present one. 
First we describe Scott’s general method, and then its application to Frege 
cardinals. 

12.42. Definition. An equivalence condition on a class A is any binary, definite 
condition ~ which has the properties of an equivalence relation, i.e., for all 
objects x, y,z G A 

x ~ x, x ~ y => y ~ x, x ~ y & y ~ z => x ~ z. 

A unary, definite operation F is determining for ~ if 

x~y •<=> F(x) = F(y) ( x, y € A ); 

the class of values of F for arguments in A is the quotient (class) of A by ~ 
determined by F . 

F[A\ = df {F(.\-) | x G A}. 

For example, the condition =,, of similarity is an equivalence condition on the 
class of well ordered sets, and the von Neumann ordinal assignment ord({7) 
is a determining operation for it, with quotient the class ON of ordinals. The 
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condition = c of equinumerosity is an equivalence condition on the class of well 
orderable sets, and the von Neumann cardinal \A\ operation is determining 
for it, with quotient the class of von Neumann cardinals. 

We can think of an equivalence condition ~ on a class A very much as if 
A were a set and ~ C A x A an ordinary equivalence relation on it. There 
is no easy way to define a determining operation for ~, however, because the 
classical construction of equivalence classes 4.12 truly leads to “classes” which 
need not be sets in this case: this is the problem with Frege’s definition of the 
number 1 above. 

xl2.45. (Scott) Suppose ~ is an equivalence condition on a class A of pure, 
grounded sets, and for each x € A let 

p(x) =df {pa G ON)(3y G V a )[y ~ x], 

F{x) =df {y G V p(x) | y ~ x}. 

Prove that F is a determining operation for ~ on A. 

xl2.46. (Scott) Define the Scott cardinal \ A\ S of every set A which is equinu- 
merous with a pure, grounded set, so that for all such sets A and B, 

A= C B *=* \A\ S = |5|,.. 
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APPENDIX A 


THE REAL NUMBERS 


In this Appendix we will show how the rational and the real numbers can 
be represented faithfully in set theory as the natural numbers are; that is, we 
will identify some characteristic, set theoretic properties of these systems and 
we will prove from the axioms of ZDC the existence and uniqueness (up to 
isomorphism) of structured sets with these properties. The proofs are quite 
simple as far as set theory goes, but they use ideas from algebra and analysis, 
which we will present in outline. 

There are two. standard representations of the real numbers in set theory, 
Cantor’s equivalence classes of Cauchy sequences and Dedekind’s Dedekind 
cuts. Here we will employ a somewhat novel construction which interweaves 
between these two and (incidentally) establishes their equivalence. The basic 
notion that we need is that of a quotient of a set A by an equivalence relation 
on A. described in Problem x4.5. 

Recall that a determining surjection for an equivalence relation ~ on a set A 
is any surjection 

n : A — » B 

such that for all x,y £ A, 

x ~ y -4 = => n(x) = n{y). 

When this holds, we call B a quotient of A by The canonical surjection of 
~ is the mapping 

X [x/~] (x £ A) 

with quotient the set of equivalence classes but there may be others, 

more illuminating of the situation, e.g., those described in Problems x4.6, 
x4.7 and x4.8. Determining surjections are especially useful in the study of 
congruence relations. 

A.l. Definition. Suppose ~ is an equivalence relation on A and 

f : A x A —> A 


The material in this Appendix can be read after Chapter 10 . It assumes some theoretical 
knowledge of the Calculus, and it is not a prerequisite for understanding the remainder of these 
Notes. 
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Figure A.l. Congruence for /. 

is s binary function. We call ~ a congruence for / if for all x, x' . y, y' G A, 
x ~ x'&y ~ y' =>/(x,y) ~ f{x',y'). 

Similarly, ~ is a congruence for a binary relation P C Ax A, if for all x, x' , y, y' , 

x ~ x' &y ~ y' [xPy x'P}']. 

We can obviously define the notion of congruence for functions and rela- 
tions of any number of arguments, in the same way. 

The next theorem deals with one of the simplest and most basic algebraic 
constructions. 

A.2. Theorem. Let n : A — » B be a determining surjection of some equivalence 
relation ~, so that for all x, y in A, x ~ y nix) = n(y). 

(1) If ~ is a congruence for a function f : A x A — > A, then there exists 
exactly one function f n :BxB—>Bon the quotient B such that the diagram 
in Figure A.l commutes, i.e., 

f M (n(x),n(y)) = n(f(x,y)) (x,y G A). (A-l) 

(2) is a congruence for a relation P C A x A, then there exists exactly 
one relation P n C B x B on the quotient B such that 

n{x)P K n{y) xPy (x, y G A). 

Proof. The form of (A-l) makes it clear that at most one function can 
satisfy it. so it is enough to show that at least one function does. Put 

f n =df {{(n{x),n{y)),n{z)) \ x,y,z G A&f(x,y) = zj. 

To verify that the set of pairs f n is a function, we must check that 

((w, v), w), ((w, v), w') G f n =^w = w'. (A-2) 

From the hypothesis of (A-2) and the definition of f n , there exist x, y,z G A 
such that 

u = 7i(x), v = n(y), w = n(z), f(x,y)=z 
and also x', y', z’ G A, so that 

u = 7 r(x'), v = n(y'), w' = n{z'), f{x',y') = z' . 
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It follows that 

n(x) = 7 i(x'), 7t GO = n(y'), 7 z(f(x,y)) = n(f(x',y')) 

since ~ is determined by n and it is a congruence for f , and the last of these 
equalities yields the desired w = w'. 

The characteristic property (A-l) of f n follows immediately from its defi- 
nition and the proof of Part (2) is similar. H 

A.3. Exercise. Prove Part (2) of A.2. 

The axiomatic characterizations of the rationals and the reals are based on 
the notion of an ordered field, which codifies the basic properties of addition, 
multiplication and ordering in these number systems. 

A.4. Definition. A field is a structured set (F, 0. 1,+,-) of objects with the 
following properties. 

(FI) 0, 1 € F , 0 f 1, and +, • are binary functions on F. 

(F2) The addition function + satisfies the identities 

1. (x + y) + z = x + (y + z), 

2. x + y = y + x, 

3. x + 0 = x, 

and for every x, there exists some x', such that x + x' = 0. 

(F3) The multiplication function • satisfies the identities 

1. (x • y) ■ z = x • {y • r), 

2. x ■ y = y ■ x, 

3. x • 1 = x, 

and for every x f 0, there exists some x" , such that x • x" = 1 . 

(F4) Addition and multiplication together satisfy the identity 

x • (j + z) = x • y + x • z. 

A.5. Lemma. Every field F has the following properties'. 

(1) For each x there exists exactly one x' such that x + x' = 0. and we denote 
it — x; for every x 0 there exists exactly one x" such that x ■ x" = 1 and we 
denote it by x~ l . 

(2) x • 0 = 0. 

(3) x ■ y = 0 => x = 0 V y = 0. 

(4) (-x) • y = -(x-y). 

Proof. (1) If x + x' = 0 and x + y = 0, then from the axioms 

y = y + 0 = 0 + y = (x + x') + y = x + (x' + y) 

= x + (y + x') = (x + y) + x' 

= 0 + x' = x' + 0 = x' . 


The proof about x 1 is similar. 
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(2) x • 0 = x • (0 + 0) = x ■ 0 + x ■ 0, and therefore 

0 = x • 0 H — (x -0) = (x • 0 + x • 0) H — (x • 0) 

= x • 0 + ((x • 0) H — (x • 0)) = x • 0 + 0 = x • 0. 

(3) If y 0, then some j -1 exists such that y ■ y~ l = 1, so that 

X = x\ =x- (y ■ y- 1 ) = (x • y) • y~ l = 0 • y~ l = y ~ 1 -0 = 0. 

(4) x • y + (— x) ■ y = y ■ x + y ■ (— x) = y ■ (x + (— x)) = y ■ 0 = 0, and 

(1) implies that (— x) • y = — (x • y). H 

We gave this proof in full as an example, justifying each step from the 
field axioms. In the future we will cut corners, skip details or (more often) 
use identities which obviously hold in every field without proof or explicit 
mention. 

A.6. Exercise. Every field F satisfies the identity 

(x + y) 2 = x 2 + 2xj + y 2 , 

where 2=1 + 1. (Give the proof in full detail.) 

A.7. Exercise. The doubleton {0, 1} of the first two natural numbers is a field, 
with the obvious operations, and in this field 1 + 1 = 0. It follows that the field 
axioms do not imply 1 + 1^0 and we must be a bit careful! 

A.8. Definition. An ordered field is a structured set 

(E0,1, +,-,<) 

where (F. 0, 1, +, •) is a field, the binary relation < is a linear ordering of F 
and the following conditions hold for all x, y,z e F: 

x < y =+ x + z < y + z, 
z > 0 & x < y => z ■ x < z ■ y, 

where z > 0 naturally abbreviates 0 < z & z 0. 

A.9. Exercise. In every ordered field, 

z > 0 & x < y =+ z • x < z • y. 

A.10. Lemma. Every element x in an ordered field F satisfies the inequality 
x ■ x = x 2 > 0, so that 0 < 1 and for all x, x > 0 => x + 1 >0. 

Proof. If x = 0. then x 2 = 0 > 0. and if x > 0, then x • x > x • 0 = 0, so 
that the only interesting case is when x < 0. Adding — x to both sides of this 
inequality, we get 0 < — x, so that we can multiply x < 0 by — x and we get 
(— x) • x < (— x) • 0, i.e., -(x 2 ) < 0 from the preceding Lemma, and adding 
x 2 to this inequality we get 0 < x 2 . The conclusion 0 < 1 follows because 
0^1 and 1 = l 2 , and the last claim holds because 0 < x => 1 < x + 1, so 
that 0 < x + 1 by the transitivity of <. H 

The Lemma makes it clear that we will not find in ordered fields the anomaly 
1 + 1 = 0 of Exercise A.7. Something much stronger is true. 
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A.ll. Lemma. Suppose F is an ordered field and set 

N F =f]{X C F | 0 G X & (Vx)[x G X => x + 1 6 X]}; 

it follows that (Nf, 0, (x i— ► x + 1)) is a Peano system. 

Proof. By (x x + 1) we mean the function S which associates with each 
x G Njr the element x + 1 of F, which is also a member of Nf by the definition. 
The first three axioms of Peano are obvious and the fourth (x + 1^0) holds 
because by the definition, 

Nf C {x G F | 0 < x}, 

and by the Lemma, x > 0=> x + 1 > 1 > 0. The Induction Axiom follows 
immediately from the definition of Nf as an intersection. H 

A.12. Exercise. Suppose F is an ordered field, N = Nf is the set of its natural 
numbers and +^, •«, <n the addition, multiplication and the wellordering o/N f 
as these are defined in Chapter 5. Prove that these functions and the relation < 
coincide with the respective objects in F , e.g., 

(Vx, y G N)[x + N y = x + y]. 

The members of Nf are the natural numbers of F. The (rational) integers 
of F comprise the natural numbers and their negations, 

Zf = Nf U {— x | x G Nf}, 

and it is easy to check that they are exactly all differences u — v, with u,v G Nf . 

The basic idea for the axiomatic characterization of the rationals is that 
they are an ordered field, and that every fraction is a quotient of integers, i.e. , 


m 

where m, u and v are natural numbers and m / 0. This simple observation 
yields not only the axioms for the rationals, but also proofs of their existence 
and uniqueness. 

A. 13. Definition. A system of rational numbers is any ordered field F which 
satisfies the condition 

(Vx G F)(3m, u, v G Nf)[w O&x = m~ l ■ (; u — u)]. 

A.14. Theorem (Uniqueness of the rational numbers). For any two systems of 
rationed numbers F\ , Ip there exists exactly one bijection 

n : F\ >-» F 2 

which is an isomorphism, i.e., 

1. Jl(0l) = 02,^(10 = 12. 

2. n(x +i y) = 7 r(x) +2 n{y), n{x -\ y) = 7 fix) - 2 7 i(y). 

3. x <1 y ^==> 7r(x) <2 7t(y). 
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In stating this theorem we decorated the various objects with the super- 
scripts 1 or 2 to clarify the field to which they belong, e.g., +i is addition in F\ 
and 0 2 is the zero element of F 2 . This is awkward and unnecessary, because 
it is always obvious which superscript is needed: e.g., the identity 7r(0) = 0 
cannot mean anything else but 7r(0i) = 0 2 , since n is a function with domain 
F\ and image F 2 . In the proof and in the future we will follow the general 
algebraic practice by which all the zero elements are 0, all additions are +, etc. 
We will also begin to skip the • of multiplication, 

xy =df ^ • y- 

Proof. By the uniqueness of the natural numbers and A. 11, we know that 
there exists a “canonical” isomorphism 

p : Ni >— » N 2 , 

where N] and N 2 are the sets of natural numbers in F\ and F 2 . respectively. 
We set 

k = {{m~ x {u — v), p{m)~ l {p{u) — p{v))) \ m, u, v € M i . /?? f 0}. 

so that 7 1 C Fi x F 2 and it is enough to show first that n is a function, then 
that it is a bijection, and finally that it is an isomorphism, as we defined this 
in the formulation of the theorem. 

To verify first that n is a function, we must show that if 

Wi(mi — vi) = mf‘(u 2 — u 2 ), (A-3) 

then 

p(m,)~ I (p(u 1 ) - p(v 0) = p(m 2 )~ l (p(u 2 ) - p{v 2 )). (A-4) 

The field axioms imply easily that (A-3) and (A-4) are respectively equivalent 
to 


m 2 u\ +m iv 2 = m\u 2 + m 2 v i, 
p{rn 2 )p{u x ) + p{mx)p{v 2 ) = p{mx)p{u 2 ) + p{m 2 )p{v t ), 
and the first of these yields immediately 

p[m 2 u\ + m ju 2 ) = p(mxu 2 + m 2 v i) 

which in turn implies the second, because p is an isomorphism of Ni with N 2 
and it respects addition and multiplication by Problem x5.4. 

The same simple method can be used directly to prove the additional con- 
clusions, that n is one-to-one and finally an isomorphism. H 

A.15. Exercise. Work out in detail the proofs of 

n{x + y) = 7i(x) + n{y), 
x < y 7i(.\-) < 7i(y)- 

A.16. Theorem (Existence of the rational numbers). There exists a system of 
rational numbers. 
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Proof. If we have the rationals. we can define the set 
A = {(/«. u, v) | m, u, v e N&m 0} 
of triples of natural numbers, and on it the relation 

(m. u, v) ~ [m' , u', v') *^==>-df m ' u + rnv’ = mu’ + m'v, 
which (quite obviously) satisfies 


( m , u, v) ~ (. m u', v') 


u — v 


u — v 


m m 

This means that ~ is an equivalence relation determined by the surjection 


U — y 

n : A — » Q, n(m,u,v) = . (A-5) 

m 

We do not have the rationals yet, but we have A and the idea for the proof 
is to define the rationals as a quotient of A by ~ so that (A-5) holds. First we 
must show that 

( 1 ) ~ is cm equivalence relation. As an example, we verify that ~ is transitive. 
From its definition, if 


(m\,u\,v\) ~ {m 2 , U 2 ,V 2 ) &(m 2 , U 2 ,V 2 ) ~ (m 2 ,u 2 ,v 2 ), 

then the identities 


m 2 u\ + ni\v 2 = ni\u 2 + m 2 v\, 
m 2 u 2 + m 2 v 2 = m 2 u 2 + m 2 v 2 

hold in the natural numbers, and if we multiply the first of these by m 2 and 
the second by m\ and then add them, we get 

m 2 m 2 u\ + m 2 m\v 2 + m\m 2 u 2 + m\ m 2 v 2 

= m 2 m\u 2 + m 2 m 2 v\ + m\m 2 u 2 + m\m 2 v 2 . 

Subtract now m 2 ni\v 2 and m\m 2 u 2 from the two sides and divide by m 2 , which 
gives 

>n 2 u\ + m i U3 = m\ u 2 + m 2 v\, 

i.e., (m\,u\,vi) ~ (m 2 ,u 2 ,v-}). Reflexivity and symmetry are proved in the 
same way. 

(2) Definition of the rationals. Since ~ is an equivalence relation, there exists 
a surjection 

n : A — » Q 

onto some set Q which determines it, so that 

(mi, «i, U[) ~ [m 2 , u 2 , v 2 ) <==>■ = n(m 2 ,u 2 ,v 2 ). 

This Q is the set of rationals in the system under construction, and it remains 
only to specify 0 and 1 , to define addition, multiplication and the ordering, 
and finally to prove that the axioms for the rationals hold. To help follow the 
argument we will start right away using the notation 

u — v 


m 


df n(m, u, v ) 
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as an abbreviation, i.e., without defining separately “subtraction” or “divi- 
sion”. 

The zero and the one are defined in the obvious way, 

0 = = df n(l, 0, 0), 1 = — j— =<«■ ^(1, 1,0). 

(3) Addition of rationals. With the representation of rationals as quotients 
of a difference of numbers by a number which we are using, the classical 
formula for addition of fractions takes the form 


u i — V\ 


Ul - V 2 


(m 2 u\ + m\u 2 ) — (jn 2 v\ + m\v 2 ) 


m i m 2 


m\ m 2 


So we define first on the set A the binary function / + which corresponds to 
this formula, 


/+((mi, u\,v\), (m 2 , u 2 ,v 2 )) = {m\m 2 ,{m 2 u\ + m\u 2 ),{m 2 v\ +m\v 2 )). 

With a bit of arithmetic we can prove that for all x, y, x',y' G A, 

x ~ x'&y ~ y' =$. f + ( x ,y) ~ f+{x',y'), 

i.e., ~ is a congruence for f + . It follows by A.2 that there exists a (unique) 
function 

+ : Q x Q — > Q 

which satisfies the identity 

n(x) + 7 z(y) = 7 z(f+(x,y)) (n(x),n(y) G Q). 

Verification of the axioms (F2) for addition needs a bit more of arithmetic, 
but at least the condition for 0 is obvious: 


7 z(m, u, v) + 7r( 1 , 0, 0) = 7 i(m ■ 1, 1 • u + in ■ 0, 1 • v + m ■ 0) = n(m. u, v). 

(4) Multiplication of rationals. Following the same method, we define first 
the function /. : A x A — > A which corresponds to multiplication when we 
represent rationals by triples of natural numbers, 

/.((mi, MI, Ul), (/77 2 , U 2 ,V 2 )) =df {m\m 2 , U 1 77 2 + V\V 2 , U\V 2 + u 2 v i), 

we verify next that ~ is a congruence for /. and we define the multiplication 
operation on fractions • by A.2 so that it satisfies the identity 

n(x) ■ 7t(y) = 7 z(f.(x,y)) (n(x),n(y) G Q). 

Verification of axioms (F3) and (F4) requires just a few computations. 

(5) Ordering of the rationals. The critical equivalence in this case is 

Ml — Ul u 2 — Vi 

< ni\v 2 + m 2 u\ < tn 2 v\ + ni\u 2 . 

in i m 2 

We first define the relation P C A x A by 

(. m\,u\,v\)P{m 2 ,u 2 ,v 2 ) m\v 2 + m 2 U\ <m 2 V\+m\ii 2 , 
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we verify that ~ is a congruence for P and using A.2 we define < on the 
quotient Q so that 

n(x) < n(y) •<=>■ xPy (n(x), n{ v) € Q). 

It follows easily that < is a linear ordering, and that the structured set 

(Q, 0,1, +,-,<) 

is an ordered field. 

It remains to verify that Q is a system of rational numbers. 

Lemma 1. For each natural number k, 7t(l,k.0) G Nq, i.e., the rational 
n(\,k, 0) belongs to the set of natural numbers of the ordered field Q. 

Proof. By induction on k, n{\, 0, 0) =0 (by definition) and (easily) by the 
definition of rational addition 

71(1,5^0) = n(l,k, 0) + 1, 

so that n{\,k, 0) G Nq => n(l,Sk, 0) G Nq. H (Lemma 1) 

Lemma 2. For cdl (m. u, v) G A, 

n[m, u, v ) = 7t(l, m, 0) _1 (7r(l, u, 0) — n(\,v, 0)), (A-6) 

where _1 and — are the multiplicative and additive inverse (partied) functions of 
the field Q. 

Proof. Having proved already that Q is a field, we know that (A-6) is 
equivalent to 

n(\,m. 0)n(m. u, v) + n(\,v, 0) = 7t(l, u. 0), 
and the latter identity is easy to verify with a direct computation.H (Lemma 2) 

The two Lemmas together show that the structured set (Q,0, 1, +,-,<) 
satisfies the characteristic property of the rationals and this completes the 
proof. H 

As we did with the natural numbers, we now fix a specific system of ratio- 
nal numbers (Q, 0, 1, +, •, <) whose elements we will henceforth call rationals. 
This is convenient, it helps avoid awkward expressions like “members of any 
system of rational numbers” and the like. However, it is important to em- 
phasize (once more) that the significant mathematical fact is the existence 
and uniqueness up to isomorphism of one such system: it was precisely the 
corresponding mathematical facts about the natural numbers that we have 
used in the proofs of this Appendix, not the specific identity of “the natural 
numbers”. 

A.17. Exercise. The set Q of rationals is countable. 

A.18. Exercise. In the proof of Part (1 ) of the theorem we “subtracted” the 
same number from an identity and then “divided” an identity by the same num- 
ber. Justify these steps by verifying the following two properties of the natural 
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numbers : 


x + y = x + z => y = z, 
c/0&c-x = c- y =$■ x = y. 

A.19. Exercise. For every ordered field F , there exists exactly one imbedding 
of the rationals in F , i.e., an injection 

71 : Qy-> F 

which satisfies the identities 

7t(0) = 0, 7l( 1 ) = 1, 

7i {x + y ) = n(x) + n(y), 7 i{xy) = n(x)n(y), 
x < y ■£=£> ti{x) < ; z(y). 

It follows that the image 7i[Q] C F of n is a system of rationed numbers ( with 
the 0 and 1 of F and the restrictions of the operations and the ordering of F). 

There is a beautiful theorem of Cantor which characterizes the ordering of 
the rationals independently of their algebraic structure. For it, we need first 
some definitions. 

A.20. Definition. Suppose < is a linear ordering on a set A and B C A. We 
call B dense in A if 

(Vx, y € A)[x < y => (3b £ B)[x < b &b < y]]. 

A linear ordering < is dense in itself if its field (A) is dense in A. 

A.21. Exercise. The ordering of every ordered field is dense in itself and has no 
minimum or maximum element. 

A.22. Theorem (Cantor). Any two countable, linear, dense in themselves order- 
ings (A, <a ) and ( B , <b) without minimum or maximum element are similar, 
i.e., there exists an order-preserving correspondence f : A >— » B. 

In particular, every countable, linear, dense in itself ordering is similar with 
the ordering <q of the rational numbers. 

Proof. The hypothesis gives us enumerations without repetitions 
A = {ctQ,a\,. . . B = {b 0 ,bi,...} 
of A and B . We will define by recursion a sequence 

/o../i 

with the following properties, for every n £ N. 

(1) /„ is a finite, partial function from A to B, i.e., 

Function)/,,) & /„ C A x B. 

and /„ is finite as a set of ordered pairs. 

(2) /„ is monotone and one-to-one on its domain, i.e., 

x,y £ Domain (/„)&x < A y=>/«(x) < B /„(y). 

(3) f n C f n+ i. 
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(4) {ao, a i, . . . , a„ } C Domain(/„). 

(5) {b 0 ,bu... ,b n } C Image)/*). 

If we can succeed in this, then the union / =<jf U n-(T/« ' s (easily) a function 
by (3), it is one-to-one and monotone by (2) and 

Domain)/) = A, Image)/) = B 

by (4) and (5). 

At the basis of the recursive definition we start with 

p 0 = {(ao-bo)}. 

so that all the conditions of the result hold trivially. 

Suppose now that we have already defined f„ and enumerate its finite 
domain of definition and image in increasing order: 

Domain)/,,) = D n = {x 0 < A x\ < ■ ■ ■ < A x m }, 

Image)/,,) = /„ = {j 0 <b Ji <b • • • <b y m }- 
Since /„ is monotone, we have 

fn(xj)=yi (i = 0,...,m). 

We construct the next f n+ \ in two steps, i.e., first we will define a partial 
function /' + l D /„ which satisfies (1) - (4) and then define /„+ 1 D f' n+1 
which satisfies all (1) - (5). 

Step 1. If a n+ i G Domain)/,,), set f' n+l = f„. Otherwise there are three 
cases. 

Case la. a n+ \ < A x 0 . In this case we find some y' G B satisfying y' < B y 0 
(which exists because B has no minimum) and set 

fn+i = /« U {(a„+i,y)}. 

Case lb. a n+ \ > A x m . In this case we find some y’ G B satisfying y' > B y m 
(which exists because B has no maximum) and we set 

fn + 1 =/«U {(«„+!,/)}. 

Case lc. For some i , x, < A a n+ \ < A x i+ \. In this case we find some y’ G B 
satisfying y t < B y' < B >’,+i (which exists because B is dense in itself) and we 
set 

fn + 1 =/»U{(a„+i,/)}. 

In all cases, the proof that f' n+l satisfies (1) - (4) is simple. 

In Step 2 of the construction we consider the element b n+ \ of B and we 
distinguish again cases: first if b n+ 1 G Imag e(/' +1 ) (in which case we set 
fn+l = /,', +1 ) and if not, then three cases again, in symmetry with Step 1. We 
skip the details. H 

The fundamental intuition about the real numbers is that on the one hand 
they are an ordered field, so that their arithmetic and ordering satisfy the 
same laws as the rationals, and on the other, they are in one-to-one corre- 
spondence with the points of the “complete” geometric line so that there are 
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no “gaps” between them. In formulating the property of completeness we 
follow Dedekind. 


A.23. Definition. A linear ordering < on a set A is complete if every non- 
empty subset of A which has an upper bound has a least upper bound. A 
system of real numbers is any complete ordered field, i.e.. any ordered field in 
which the ordering is complete. 


A.24. Exercise. The ordering of the rationals is not complete , because the set 

X ={r | r 2 < 2} 

is bounded from above but has no least upper bound. 

A.25. Lemma. Every complete, ordered field F has the archimedean property 
{fix € F)(3n € Np)[x < n], 


i.e., the set N = Np of its natural numbers is not bounded from above. 

Proof. Assume towards a contradiction that the set N has an upper bound, 
so that it has a least upper bound x = sup N by the completeness property. 
The element x — 1 is not an upper bound of N because x — \ < x. so there 
must exist some n <E N, x — 1 < n: but this implies x < n + 1 which contradicts 
the assumption that x is an upper bound of N. H 

A.26. Exercise. In every complete, ordered field F , 


e > 0 = 


■ (fin e 


1 

n + 1 


< e 


A.27. Exercise (Density of the rationals). In every complete, ordered field F , 


x < y => (3r e Q)[x < r&r < y]. 


where Q = Qp is the set of rationals in F . 

We now aim to show that there exists a complete ordered field and that any 
two complete, ordered fields are isomorphic. Since the completeness property is 
geometric (or topological), these proofs of existence and uniqueness depend 
on geometric ideas. Specifically, we will need some basic definitions and results 
from the theory of limits which is studied in Calculus. We will review these 
here, briefly but without limiting ourselves to the absolute minimum list of 
theorems necessary to prove the existence and uniqueness of the reals: we have 
included several Lemmas and Exercises because they support the proposition 
that the notion of a complete, ordered field represents faithfully our geometric 
intuitions about the real numbers. 

Since we will be manipulating (infinite) sequences a great deal, we will use 
consistently the familiar, simple notation which avoids the introduction of a 
distinct name for each sequence and treats the variable as an index, so that 
a sequence {n i— > a„) is denoted by ( a „ ) or (u 0 . a i, . . . ),. For example, the 
sequence 

-J-wI.U... 

n + 1 / \ 1 2 3 
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is the function / : N — » Q defined by the formula 


/(«) 


1 

n + 1 


The absolute value function is defined in every ordered field in the usual 

way, 


|.x| =df max{x, —x} 


x, if x > 0, 

— x, if.x<0. 


A.28. Definition. Suppose ( F. 0, 1, +, •, <) is an ordered field. 

(1) A sequence (x n ) of elements of F converges to x G F or has limit x, if 

(Ve G F,e > 0)(3A G N)(Vn G N)[« > K => \x — x n \ < e]. 

We will use the notation 


x n —> x ( x n) converges to x. 

(2) A sequence ( x „ } has the property of Cauchy, or (simply) is Cauchy if 
(Ve € F, e > 0)(3^f G N)(V«. m G N)[/i. m > K =» \x n — x m \ < e]. 
A.29. Definition. For all a < b in an ordered field F, the set 

( a , b) =df {.x G F | a < x < b} (A-7) 

is the open interval with endpoints a and b. A set G C F is open if it is a 
(possibly empty) union of open intervals, equivalently 

x G G => (3 a < b)[x G ( a , b) C G]. 

We will also use the standard notations for closed and half-open intervals, 


e.g., 


{a, b] = {x G F | a < x < b }, 

and also for the half-intervals with — oo or oo as one or both of their endpoints. 


e.g., 


(—oo ,b) = {x G F | .x < b), [a, oo) = {x G F \ a < x}. 


A.30. Exercise. Prove that the family of open sets in an ordered field is a topol- 
ogy and that the definition of limits for sequences in A.28 is equivalent to the 
topological definition of limits given in 10.42. 

These definitions are notorious for the difficulty of understanding what 
they mean and learning how to use them. We emphasize that here we study 
them in the context of an arbitrary ordered field which need not be complete, 
for example, the rationals. It is useful to formulate conditions equivalent to 
convergence and the property of Cauchy, on the basis of the following notion. 


A.31. Definition. A sequence ( x„ ) in an ordered field settles in an open interval 
( a , b), if after a certain stage all its terms belong to some closed subinterval 
[a' , b'] C ( a , b ): 

( x n ) (a,b) ^==4>df (3 K,a' ,b')(fin > K)[a < a' < x n < b' < b]. 
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Notice that if (x n ) {a. b), then all its terms after a certain stage belong to 

{a, b), 

( x„ ) {a, b ) => ) (V/r > K)[a < x n < b\, 

this weaker property is all we need for many applications of the definition of 

(x n ) (a,b). 

A.32. Exercise. (x„) (a. b ) if and only if there exists some 8 > 0 such that 

a + 8 < b — 8 and the set {n € N | x n £ [a + 8, b — d]} is finite. 

A.33. Exercise. For all open intervals I. J . (. x n ) I & / C J => (x n ) J . 

A.34. Exercise. For all open intervals I. J . 

(x n ) I & (x n ) J => (x„) / fl / ==> / n / 0. 

A.35. Exercise. Every sequence (. x n ) which settles in some open interval {a, b) 
Abounded, i.e.. (3 w)(\/n)[x„ < w\. 

The next Lemma makes it possible in many cases to avoid the so-called 
“method of epsilonics”, which is illustrated by its proof. 

A.36. Lemma. For every sequence in any ordered field F: 

(1) (x n ) converges to x if and only if (. x„ ) settles in every open interval which 
contains x, 


x„ —> x ■<=>■ (Va, b e E)[fl < x < b => (x„) {a. b)). 

(2) (x n ) is Cauchy if and only if for every e > 0, there exists an open interval 
{a, b) in which (. x n ) settles and such that ( b — a) < e: 

( x„ ) is Cauchy (Ve > 0)(3u. b)[a < b < a + e & (x n ) (a. b)] 

Proof. (1) If x n — > v and a < x < b, then the definition of convergence 
with 

min(x — a. b — x) 


supplies a number K such that 


n > K 


\x - x n \ < 


min(x — a,b — x) 


which with a bit of inequality massaging implies that 
n > K 


. a + x x + b , , , 

a < a = — - — < x n < — - — = b < b. 


so that (x„) ( a , b). For the proof in the other direction: for every e > 0, 

(x n ) (x — e, x + e), so that for some K, 


n > K => \x — x„ | < e. 

(2) If (x n ) is Cauchy, then for every e > 0 there exists some K such that 
n. m > K ==> \x n — x m \ < |, which immediately implies that 

. £ £ 

{ X n ) ^ [Xk - 2> X K + y)’ 
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and this interval has length e. In the other direction, for every £ > 0 there 
exists some {a, b ) with (b — a) < e, so that ( x n } {a, b ), and so for some K, 

n, m > K =y [a < x„ < b &a < x m < b =y \x n — x m \ < b — a < e], 
which means that (x n ) is Cauchy. H 

A.37. Corollary. If (x n ) converges to some x, then it is Cauchy. 

Proof. For every £ > 0, ( x„ ) -w (x — §,x + §) by (1) of A.36, so it is 
Cauchy by (2) of A.36. H 

A.38. Exercise. x„ — > x & x„ —> y x = y. This allows us to introduce the 
classical notation 

x = lim„ x n x„ x. 

A.39. Lemma. If (x„) is Cauchy in a complete, ordered field F, then (x n ) has a 
limit, i.e., x„ — > x with some x. 

Proof. Let 


X =df {u e F | (3u)[m < v&(x„) (u,v)]}. 

Since (x„) is Cauchy, there exists some (c, d) such that (x„) (c, d), and 

hence 

u £ X u < d, 

since d < u => (x„) (c, d) fl (; u , v) = 0 which is not possible; hence X is 

bounded from above and it must have a least upper bound 


x = sup X. 

We will show that 

a < x < b => (x„) (a, b) 

which implies x„ — > x by A.36. Using the hypothesis a < x < b and the fact 
that (x„) is Cauchy, we find first some (u, v ) such that 

v — u < min(x — a,b — x) , (x n ) -w (u, v). (A-8) 

By definition, u e X, and hence (1) u < x, because x is an upper bound 
of X. On the other hand, v is also an upper bound of X (because the 
assumption (x„) ( u ' , v') with v < u' implies that (x„) settles in two disjoint 

intervals), and hence (2) x < v, since x is the least upper bound of X. Now 
(1) and (2) together yield u < x < v, which together with (A-8) implies 
a < u < x < v < b, so that (x„) ( a , b). H 

The next two basic theorems relate completeness as we defined it (following 
Dedekind) with the notion of completeness historically associated with the 
name of Cantor. 

A.40. Theorem (The nested interval property). Suppose that every Cauchy se- 
quence in an ordered field F converges, and that 

[xo,yo] 5 [xi,yi] 2 • • • (A-9) 

is a nested sequence of closed intervals such that 

lim „{y„ - = 0; 


(A-10) 



214 


Notes on set theory 


it follows that the intersection P| n [x n , y„] is a singleton 

r\ n [x n ,y n ] = M- 

and its only member is the common limit of the sequences (x n ) and (>’„), 

w = lim„ x n = lim„ y„ . 

Proof. The basic observation is that for every number K and every S > 
0, (x n ) [x K — S,y K + <S] by (A-9). Now (A-10) implies that (x„) is 
Cauchy and hence x„ — > x for some x, using the hypothesis. Moreover, 
x < xk => (x n ) (x — 1 ,xk) by A.36, which contradicts the basic properties 
of the relation since (x — 1 , xk) n [xk , yk\ = 0. hence, xk < x, for every K. 
By a similar argument x < )>k, for every K, so that in the end x G f] n [x n , y n ]. 
Symmetrically, ( y „ ) converges, y = lim n y n G f\ n [x n ,y n \, and for every n, 
\x — y\ < (y n — x„) which implies x = y by (A-10). H 

A.41. Theorem. An ordered field F is complete if and only it has the archimedean 
property (A.25) and every Cauchy sequence in F has a limit. 

Proof. One direction is known from Lemmas A.25 and A.39, so it is enough 
to show that if F has the archimedean property and every Cauchy sequence 
in F converges, then F is complete. 

Suppose then that A is a non-empty, bounded from above set in F, so that 
there exists some point x 0 G X and some upper bound y 0 of X. Beginning 
with [xo,yo]. we define by recursion a sequence of closed intervals [x„ . y n J 
which satisfy the following conditions: 

1- x n C x n — ] < y n — i G V/i . 

2. (>’„ - x n ) = 2~ n (y 0 - x 0 ), 

3- [x„, y n \ D X f 0, 

4. (Vx G X)[x < 

In detail, to define [x„+i, y n +\] we distinguish two cases: if w = \(x n + y„ ) is 
an upper bound of X, we set [x n+ \, y n +\\ = [x„,w], otherwise [x„ + i.p„+i] = 
[uk y n + 1 ]. Proof that [x n+ \,y n +\\ satisfies (1) - (4) is trivial. 

Lemma. Iim„ (>'„ — x n ) = 0. 

Proof. The archimedean property implies that for every e > 0, there exists 
some natural number K > 0 such that 

— - — < K < 2 k . 
s 

where the inequality K < 2 K is verified easily (by induction!). It follows that 
for every n > K. 

( yn - X„) = 2~"(po - A'o) < 2~ K (y 0 - x 0 ) < e, H (Lemma) 
Now A.40 implies that 

f| [*B.Tn] = {w} 
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where w = lim„ x n = lim„ y n . and it is enough to verify that this common 
limit w is the least upper bound of X. We compute: 

w < t =>■ (y„) (w — 1, t) because lim„ y n = w , 
y n < t for some n. 

t ^ X because y n is an upper bound of X , 

so that w is an upper bound of X. Also, 

t < w (x„) {t,w + 1) because lim„ x n = w, 

=> t < x n for some n, 

(3x G X)[t < x] by the definition of x n , 

so that there is no upper bound of X smaller than w. H 

It is worth pointing out here that there exist Cauchy complete ordered fields 
which are not archimedean. Problem xA.2. 

Following Cantor (up to a point), we will construct a complete, ordered 
field as a quotient of the Cauchy sequences on the rationals by the following, 
natural equivalence relation. 

A.42. Definition. We call two sequences of rationals (x„) and (y n ) asymptoti- 
cally equivalent if their difference converges to 0, in symbols, 

(x n ) ~ ( yn ) ''df (Am y n) ' d. 

A.43. Theorem. (1) Two Cauchy sequences ( x„ ) and ( y n ) are asymptotically 
equivalent if and only if they settle in the same open intervals : 

(x n )~(y n ) if a < b)[(x n ) (a,h) <==> (y n ) {a,b)]. 

(2) If (x„) and ( y n ) are Cauchy , then 

( x n ) ( y„ ) => (there exist open intervals I, J) 

[/ fl / = 0 & (x n ) I & ( y n ) /]. 

(3) The relation « is an equivalence relation on the set 

&(F) =df {(x„) | ( x„ ) is Cauchy on F}. 

Proof. (1) Suppose first that (x„) « (y„) and ( x„ ) ( a,b ), so that for 

some K 0 and some S > 0, 

n > Kq =$■ a + S < x n < b — S. 

Using ( x n ) ~ (y«), choose K\ such that 

S 


n > K\ 


\x„ - y„\ < 


X n - - < y„ < x„ + 


From these two implications we get easily that 
n > max( K 0 , K \ ) =^> 


d , S 

2~ yn ~ b ~r 
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so that (y n ) {a, b). In the other direction, for every e > 0 there exists an 
open interval ( a , b ) with b — a < e such that (x„) (a, b ) by (2) of A.36, so 

by the hypothesis, we also have 

n > K => [a < x n < b &a < y n <b\ => \y n — x n \ < s. 

(2) Directly from the definition of convergence, 

-'[(■*« - y n ) ->■ 0]=> (3e > 0) (VK)(3n, m > K)[\x n - y m \ > e], (A-ll) 

By A.36, there exist open intervals / and J of length < | (with this e) such 
that (x n ) / and (y n ) J. If some z G IC\J existed, then for K sufficiently 
large so that 

n, m > K => x„ G I & y m G J. 

we would have 

G G 

n. m > K => \x„ — y m \<\x„-z\ + \z - y m | < - + - = e, 

which contradicts (A-ll). 

(3) The reflexivity and symmetry of « are trivial. To show its transitivity, 
notice that by hypothesis and ( 1) of Lemma A.36, for every {a, b), 

{x„)~*(a,b) {y n )~*(a,b) {z„)~*(a,b), 

so that by (1), (x„) « (z n ). H 

A.44. Exercise. If (x n ) and (y n ) are both Cauchy and x„ — > x, then 

(x„) « (y„) y n -> x. 

At this point we could appeal to the existence of some quotient B of the set 

= 5?(Q) =df {(?'„) | (r n ) is Cauchy in the field of rationals} 

by w and define the necessary functions and an ordering on B so that it 
becomes a complete, ordered field. This is one of the classical proofs of the 
existence of the real numbers, connected with the name of Cantor. Instead of 
this, we will construct a specific quotient of ^ by « which simplifies the proof 
a bit and (more significantly) relates this construction with the other classical 
proof of the existence of the reals, following Dedekind. The basic idea of 
Dedekind was that a real number x is completely determined by (and hence 
can be “identified” with) the set 

(- 00 , x) n Q =df {r G Q I r < x} 

of all rationals preceding it, and that the sets of the form (— oo, x) n Q can be 
characterized directly by three simple conditions. 

A.45. Definition. A Dedekind cut is any set X of rational numbers which 
satisfies the following three conditions: 

1. X^0, (Q\X) ^0. 

2. r < q & q G X =>■ r G X . 

3. q G X => (3 r)[q < r &r G X], 
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We set 

3S =df {X C Q | X is a Dedekind cut}. 

A.46. Exercise. A set X C Q is a Dedekind cut if and only if it is non-empty , 
bounded from above , with no largest member and “downward closed”, i.e., such 
that r < q &q £ X => r £ X. 

The next theorem is basic for the proof of existence of the real numbers. 
A.47. Theorem. For each Cauchy sequence of rationals (x n ), let 
n({x„)) = df {a £ Q | (3 b)[a < b&(x„) {a,b)]}‘, 

it follows that each value i i((x„)) is a Dedekind cut and that the function 

n : ^ — » 3S 

is a surjection which determines the equivalence relation so that 3! is a 
quotient ofW. 

Proof. That each n((x„)) is a Dedekind cut is quite easy from the defini- 
tions and the general properties of the relation and the equivalence 

(■ x„ ) w (. y„ ) <=> n{(x„)) = n((y„)) 

is an immediate corollary of A.43. The only thing which is not completely 
obvious is that for every cut X there exists a Cauchy sequence (x„) G r C in the 
rationals such that n{(x„)) = X. For this we construct a nested sequence of 
closed intervals in the rationals 

[xo,yo] 2 [x\,y\\ in- 
exactly as in the proof of A.41, beginning with some xq G X and some 
yo j X , so that, in fact, >’o is an upper bound of X. We argue as in A.41 that 
the non-decreasing sequence (x n ) is Cauchy, and that, in addition, for all n, 

x n £ X & y„ £ X, 

because X is downward closed and has no largest member. Then we compute: 

a £ 7i((x n }) ==> (3 b)[a < b & (x„) (a, b)] 

=> (3n)[a < x n \ 

=» a £ X because x„ £ X. 

To see that X C n((x„)), notice that if a £ X , then there must exist 
some natural number K such that a < xk, because the opposite supposi- 
tion (V«)[x„ < a] implies (easily) that a is the largest point in X , and X 
does not have a largest point. Thus, for n > K, x K < x n < vk- hence, 
(x n ) -w {x k .}’k + 1) and a £ n((x„)). 3 

A.48. Theorem (Existence of the real numbers). There exists a complete, or- 
dered field. 

Proof. For the domain of the required field we take the set 3! of Dedekind 
cuts and for 0 and 1 we take the obvious: 

0 =df {r £ Q | r < 0} = n{( 0)), 

1 =df {r £ Q | r < 1} = 7i((l)). 
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In order to define addition and multiplication on 3! we need the following 
two Lemmas, where all the sequences are Cauchy in the rationals. 

(x n ) w ( x '„ ) & (y„) « (y'„) => (x„ + y n ) « (x' + y'} (A- 12) 

(x n ) w (x') & (y„) « (y') => (x„ • y„) « (x' • y' n ) (A-13) 

These are the useful interpretations of the classical theorems from the theory 
of limits, 


lim„ (x„ + y„) = lim„ x„ + lim„ y„, 
lim„ (x„ • y„) = lim„ x„ • lim„ y„, 

in the case at hand, when the limits need not exist since the sequences are 
in the incomplete ordered field of the rationals. They are not hard to verify 
after all the preparatory work we have done and we will skip the details. The 
implications (A-12) and (A-13) assert that « is a congruence in ?? for the 
functions 


«x„), (y„)) ^ (x„ + y„), 

((x„), (y„)) i-» (x„ • y„), 

so that by A.2 there exist functions + and • on the quotient 3S which satisfy 
the identities 


n{(x„)) + 7t((y„)) = n((x„ + y„)), 
n({x„}) ■ n((y„}) = n({x„ ■ y„)). 


We take these + and • for the operations of addition and multiplication in 2. 

Next we must show that (3f, 0, 1, +, •) is a field, but this part of the proof is 
quite trivial, if a bit tiring in its details (which we will skip). The existence of 
additive inverses, for example, follows from the obvious 

7r((x„)) + 7l((-X„)) = 7 t((x„ + ( — X„))) = 7t((0)) = 0. 


where the only “delicate” point is the observation that if (x„) is Cauchy, then 
so is (— x„). To check the corresponding property for multiplication, given a 
Cauchy sequence (x„) ^ (0), set 


yn — df 


l/x„, if x„ ^ 0. 
1, if x„ = 0. 


(A-14) 


verify that (y„) is also Cauchy and then compute: 

n{{x„)) ■ 7 t((y„)) = 7r((x„ • y n )) = 7j((l)) = 1. 
The basic observation (from (2) of A.43) is that 


96(0) 


(3(5 > 0)[(x„) (— A<5)] 

(3S>0,K)(\/n > K) \x„\ >S 
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with which we begin the proof that (>’„) is Cauchy, but some epsilonics are 
unavoidable. The related result from the theory of limits is the assertion 


lim„ x n ^ 0 = 


lim„ ( — ) = 

. x. 


1 


lim„ x„ 


traditionally known as the first hard theorem in Calculus, when it is taught 
rigorously. 

Next we define on 9f the relation 


A < T ^=> df ACT 


which is certainly a partial ordering; it is also linear, because for any two 
Dedekind cuts X and T, directly from the definition, 

r G ( T \ X) => {Mq G X)[q < r ] & (\/q < r)[q G Y] 

=> ACT 

and, of course. A / T (3r)[[r G (A \ T)] V[r G (T \ A)]]. Appealing 
once more to the definition of Dedekind cuts, we can also show easily that 

I c Sr=»lJ/ = QVLU G 3f. (A-15) 

(For example, if r were the largest point in the union (J I, then r G A for some 
A el, and then r would also be the largest point in A, since A C / — but A 
has no largest member.) From (A-15) we infer that every set / C 9S which is 
bounded above has a least upper bound, because 

(VA G 7)[A < Z]=»U^ czc Q=»U^ e S, 

and the union (J / is obviously the least upper bound of / in the relation 

< = C. 

It remains to verify that for all A, T Z G 3S 

X < Y => A + Z < T + Z, (A-16) 

[Z > 0 & A < T] => Z • A < Z • T (A-17) 

since these implications imply then immediately their versions with <. Consid- 
ering the more difficult (A-17) as an example, choose first Cauchy sequences 
such that 

n{{z„)) = Z, 7 z((x n )) = A, 7 r((y„)) = T 

and verify (easily) from the definitions (and A.43) that there exist rationals 

x° < x 1 < y° < y l 


satisfying 

{x„) (■ x°,x l ), {y„) (j 0 ,}' 1 ), 

and that for each e > 0 we can find some z° and z l such that 

0 < z° < z 1 , (z 1 - z°) < e, (z„) (z°, z 1 ). 

It follows that 


(z„ x n ) (z°x°, z 1 x 1 ), ( z n y „ ) -W (z°y°, z 1 }’ 1 ), 
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and the desired conclusion Z X < Z Y will follow quite easily, if we could 
choose z°, z 1 so that 

rV < z°y°, 

or equivalently, 

x'(z l - z°) < z°(y° - x 1 ). (A-18) 

Now (A-18) is obvious if x l < 0, because in that case x x (z x - z°) < 0 and 
z°(y° — A' 1 ) > 0. If x 1 > 0, we find first some S > 0 such that (z„) (d, oo) 

and then z°, z 1 such that 

0<3<z°<z 1 , (z n ) (z°, r 1 ), (z 1 — z°) < — — ] — - 

which imply (A-18): 

x 1 (z 1 - Z°) < X X —^——r—— < z°(y° - A 1 ). 

X 1 

Verification of (A-16) is substantially simpler and completes the proof of the 
theorem. H 

A.49. Exercise. Prove that for all Dedekind cuts X , Y and Z: 

X ■ (Y + Z) = X ■ Y + X ■ Z. 

( Use the formal definitions of + and ■ given in the proof of A.48.) 

A.50. Exercise. Prove (A- 12) and (A- 13). 

A.51. Exercise. Show that if ( x „ ) is Cauchy in an ordered field and (x n ) (0), 

then the sequence ( y n ) defined by (A-14) is also Cauchy and (x n y n ) « (1). 

A.52. Theorem (Uniqueness of the real numbers). For any two complete , or- 
dered fields F\ and /A there exists exactly one bijection 

n* : Ei >-» F 2 

which is an isomorphism , i.e.. 

1. w*(0)=0, w*(l) = l, 

2. n*(x + y) = 7i*(x) + n*(y), n*(xy) = n*(x)n*(y), 

3. x < y n*(x) < n*(y). 

Proof. By the uniqueness of the rationals A.14, there exists (exactly one) 
isomorphism 

n : Qi >— » Q 2 . 

where Qi, (Qb are the sets of rationals in the two fields F\ , F 2 , and the problem 
is to extend this n to the whole of F\ . 

Lemma 1. For each x € F\, there exists a sequence ( x „ ) of rationals in F\, 
such that lim„ x n = x. 

Proof. Using the density of the rationals (Exercise A.27), we can find for 
each n e N a rational x„ G Qi such that \x — x n \ < 1 /{n + 1), and then (easily, 
using problem A.26) lim„ x n = x. H (Lemma 1) 
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Lemma 2. For each sequence ( x„ ) of rationals in F\, 

(3a G Fi)[lim„ = a] => (3a* g F 2 )[lim„ n(x„) = a*]. 

Proof. We know that 

u < v n(u) < ; :(v) (u,v G Qi) 

because n is an isomorphism, so that for all a, b G Qi, a < b, 

(a „)~*( a,b) <=3- (n(x„)) (n(a),n(b)). (A-19) 

If (a„) converges, then it is Cauchy, so that (n(x n )) is also Cauchy by (A-19) 
and A.36 (using A.26 once more), and therefore (n(x„)) converges because F 2 
is complete. 3 (Lemma 2) 

We can use the same simple idea to verify the third basic fact we need, whose 
proof we will omit: 

Lemma 3. For any two Cauchy sequences in the rationals of F\ . 

[lim„ x n = lim„ y n \ => lim„ n(x„) = lim„ n(y„). 

The Lemmas guarantee that we can define unambiguously for each a g F \ , 
7i* (a) =df lim„ 7i(x n ), where lim„ x„ = x, with x n G Qi. (A-20) 
Since for each rational agQi, lim„ a = a, we have 
71* (a) = lim„ 7 t(a) = 7t(a), 

so that n* is an extension of n. It remains to verify that tz * is an isomorphism 
of F\ with F 2 . 

Suppose first that 

a = lim„ x n , y = \im n y n , x n , y n G Qi , x < y. 

This implies immediately (by A.36) that there exist rationals a. h. c and d 
satisfying 

(a„) (a, b). (y„) (c,d), b < c, 

and hence by (A-19) 

(7i ( a,,)) (n(a),7i(b)), (7i(y„)) ( 71 (c). 71 (d)). 

It follows that 

tz* (a) = lim„ n(x„) < lim„ n(y„) = 7i*(y) 

because n{b) < 71 (c), and this completes the proof that n* respects the strict 
relation <: 

a < y=>n*(x) < n* (y). 

Directly from this, n* is an injection and it respects the ordering, 

a < y Ti* (a) < n* (y). 
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The rest is trivial (if tiresome) and follows mostly from the limit theorems of 
the Calculus, which hold in every complete, ordered field — and can be proved 
easily with the tools we have developed. As an example: 

K* ( x + y) =df lim„ n(x„ + y„), where x„ — > x, y„ — > y, x „ , y n € Qi , 

= lim„ [n(x„) + n(y„)] 

= lim„ jt(x„) + lim„ n(y„) 

= n*{x) + 7i*(j0- 

The crucial step in this computation is the identity 

lim„[7i(x„) + n(y»)] = lim„ n(x„) + lim„ n{y„). 

We skip the details. H 

A.53. Exercise. Work out the details of the proof of 
71* (x + )’) = 71* (x) + 7t*(y). 

A.54. The real numbers. As we did for the natural numbers and the rationals, 
we now fix some complete, ordered field 

(R.O. 1, +,-,<), (A-21) 

whose members we will call real numbers. We emphasize once more that fixing 
some R is only a convenience and the specific choice of R is of no importance: 
the fundamental mathematical fact for the development of analysis is that a 
complete, ordered field exists and that any two complete, ordered fields are 
isomorphic. 

A.55. Exercise. Prove Corollary 2.14 in Chapter 2 from the axioms. 

The open sets of real numbers defined in A.29 form a topology by the easy 
Exercise A.30, so we have notions of Borel sets of reals and Borel measur- 
able functions from R to other topological spaces and vice versa, by Defi- 
nitions 10.25 and 10.39. The notion of Borel isomorphism was defined in 
Definition 10.40. The next theorem makes it possible to transfer results about 
Baire space to the reals, and it is the main tool for analyzing the set theo- 
retic properties of R. We will omit its proof, which is quite simple, using 
Problems xl0.12 and xl0.13. 

A.56. Theorem. As a topological space , R is Borel isomorphic with the Baire 
space N, and , in particular , R = c N. 

With the true (von Neumann) cardinals developed in Chapter 12 from ZFC, 
this gives the congenial equation |R| = c = 2 N °. 


Problems for Appendix A 

*xA.l. Let F be the set of all real, rational functions, i.e., all (partial) real 
functions which can be represented as quotients of polynomials with coeffi- 
cients in R, and prove that it is a field with the obvious algebraic operations. 
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Show that the relation 

/ < g «=>df (3x) (Vp > x)[f{y) < g(y)] 

is a linear ordering of F, and with it F is a non-archimedean ordered field. 
Hint. The field Qf comprises the constant functions with rational value, and 
the identity ( x i— > x) is above all of them. 

*xA.2. Prove that there exists an ordered field which is not complete, but in 
which every Cauchy sequence has a limit. Hint. Show that every ordered 
field has a Cauchy completion. 

xA.3. Every open set of reals is a countable union of disjoint open intervals. 

xA.4. Every closed interval of real numbers [a, h] is compact, in the topolog- 
ical sense, 10.43. 

xA.5. Every closed set of real numbers is a countable union of compact sets. 

xA.6. Every closed set of real numbers F can be written uniquely as the 
disjoint union of a perfect and a countable set. 

*xA.7. Prove Theorem A.56. 
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AXIOMS AND UNIVERSES 


The serious study of models of axiomatic set theories depends heavily on meth- 
ods from mathematical logic which are outside the scope of these N otes. 32 Here 
we will consider only Set Universes, generalizations of the Zermelo and the 
ZFDC-universes of Chapter 11, which are very special models and can be 
studied by standard mathematical techniques, as we study fields or topolog- 
ical spaces. First we will prove that the Zermelo universes of Chapter 11 
ARE MODELS OF ZDC and the ZFDC -UNIVERSES ARE MODELS OF ZFDC; this 
will give us a better understanding of these universes, and it will also yield 
some simple Consistency and Independence results for the corresponding 
theories. In the main part of this chapter we will construct some new set uni- 
verses with quite different properties, including the Antifounded Universe 
of Aczel which contains a rich variety of ill founded sets. We will glean some 
consistency results from these models too, but consistency results are not our 
main concern: our primary interest is to explore and understand several nat- 
ural, intuitive notions of set and compare them with the standard conception 
of pure, grounded set discussed in 12 . 34 . 

We begin with a result about the least Zermelo universe Z which is somewhat 
surprising, given how much we promoted Z in 11.25 as a rich collection of 
sets which contains all objects of interest of classical mathematics. 

B.l. Theorem. The set V w of all pure, grounded, hereditarily finite sets is not a 
member of Z\ in fact, there is no set A £ Z such that 

0 G A&.ifX)[X G A=>V(X) G A], (B-l) 

Proof. For each v e Z, we let 

level(x) = the least n such that x G Z„, (B-2) 


,2 For those who do know logic, we remark here that Ihe most natural way to formalize the 
theories we have studied is in a many sorted predicate logic with identity, with separate variables for 
objects, definite conditions and definite operations of every arity. and relation symbols Set and g. 
Notice that we did not assume in 3.18 any extensionality principles for conditions or operations, 
and we have never appealed to such principles. This means that to get (a notational variant 
of) the classical Godel-Bernays Theory from the ZFC of 11.31 we must add extensionality for 
conditions. On the other hand, precisely because we did not include any extensionality axioms for 
conditions, we can view the present ZFC as a notational variant of the classical Zermelo-Fraenkel 
Theory, by interpreting “conditions” to be just “formulas with set parameters”. 
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so that members of No have level 0, but 

level(x) > 0=+level({x}) = level (x) + 1; 

because if x G Z„, then {x} G Z„+i, and if {x} G Z n with n > 0, then 
{x} C Z„_ i by (11-20), which implies x G Z„_ Define now by recursion 
the sets 

-4o = {0, {0}}^ A n +i = {A,,}. 

Clearly, each A„ G V co , level(+o) = 1, since A 0 C No but A 0 £ No, and by 
induction, level(+„) = n + 1. 

Suppose now that there is some A G Z which satisfies (B-l), and notice that 
A o G G A, and then by induction, for each n, A n G A. If A € Z m , 

then each A„ G Z m because Z m is transitive, and hence level (d„) < m for 
each n, which violates what we just proved. 3 

A similar argument shows that (with Kuratowski tupling) (J ^ 2 {No}" ^ 2, 
Problem xB.2, and it is quite easy to extend it to show that Z misses many 
more very simple sets, but the fact that it lacks V w is undoubtedly the most 
startling of the lot. The construction of V w is so direct, it seems to follow so 
naturally from our most basic intuitions about sets, that it is really hard to 
believe that we developed all this set theory in Chapters 3-10 and Appendix 
A from axioms which do not guarantee its existence. One may try to write 
off this feature of the Zermelo axioms as a small oversight of Zermelo and 
strengthen the axioms in some minor way to ensure the existence of V m , but 
this is the wrong way to go. On the one hand, we know the natural extension 
of ZDC+AC, it is the addition of the Replacement Axiom which can be 
justified by arguments only marginally different from those used to justify the 
Separation Axiom. And on the other hand, the importance of ZDC+AC 
lies precisely in its two, contrasting features: that it can prove so much about 
classical mathematics (which is its real domain), while it can be interpreted in 
such simple, easy to comprehend models like Z. Whatever doubts may have 
lingered about the soundness of set theory from the paradoxes should be at 
least moderated by the strength of ZDC+AC and its ease of interpretation. 

We still need to make precise the sense in which Z is a “model” of ZDC. 
Perhaps the simplest way to introduce the key, new idea we need, is to try and 
reinterpret B.l as an independence result. 

B.2. Theorem? 11+ cannot prove in ZDC the proposition that some set A exists, 
which contains the empty set and is closed under the powerset operation. 

Proof. Spelled out symbolically for precision, the proposition in question 
is 

(j) ^ df (3+)[0 G A & (VX)[X G A => V(X) G A]]. 

Suppose, towards a contradiction, that is a theorem of ZDC. Since the least 
Zermelo universe Z has all the closure properties demanded by the axioms of 
ZDC, any proof of <f> from these axioms could be translated into a proof of 
the interpretation of (j) in Z, which is 

^ z ) (3 A G Z)[0 Gd&(VlG Z)[X G A => V(X) G A]]. 
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As a consequence, (f Z) is true, so there exists a set A £ Z satisfying 
0 G A&{VX G Z)[X G A => V{X) G A], 
which by the transitivity of Z is equivalent to 

0 G A&(\/X)[X G A=>V(X) G A]; (B-3) 

but by B.l, no A G Z satisfies (B-3). H 

The argument is unfamiliar, unless you know a lot of logic, and in any case 
it is incomplete, only a sketch. What we need to elucidate is the meaning 
of “interpreting a proposition in Z”, the move from <j> to (f Zi above, and 
it may help to consider an example. Suppose we have located a traditional 
mathematician (perhaps an old-fashioned analyst) who disclaims any interest 
in general set theory beyond its applications to classical mathematics, he has 
studied Chapters 3-10 and Appendix A and he is convinced that all the 
objects he cares about live in Z. In an effort to simplify his world, he declares 
that henceforth by “object” he will mean “member of Z”: that is his universe. 
Suppose further that this person now utters the Powerset Axiom and claims 
that he believes it. What does he mean? Spelled out in terms of the primitive 
notions of sethood and membership, the powerset axiom reads as follows: 
(IV): For each set A, there exists a set B, such that for every X, 

X £ B 4==> Set (A) & (V?)[t G X => t G A], (B-4) 

Replacing “set” by “ member of Z” in this, we get something rather different. 

(IV) : For each set A G Z, there exists some set B G Z, such that for every 
X G Z, 

X G B Set(X) & (V? G Z)[t G X ==> t G A], (B-5) 

This is what our friend really means when he tries to tell us that every set has 
a powerset, and it differs enough from the Powerset Axiom that its truth is 
not immediately apparent. To prove it, for each A G Z, we naturally take 
B = P(A), which is also in Z and satisfies (B-4), for every X. Notice next 
that for each X G Z, 

(V? G Z)[t G X => t G A] (Vr)[? G X => t G A], 

easily, by the transitivity of Z. This reduces (B-5) to 

X £ B •<=> Set (35) & (V?)[f G X => t G A], (B-6) 

which is true for every X and hence (in particular) for every X £ Z. 

As a matter of fact, all the axioms of ZDC yield true propositions when we 
replace in them “object” by “member of Z”. It follows that all the theorems 
derived from the axioms of ZDC by logic alone also yield true propositions 
when we understand them as assertions about Z in the same way, so our 
stipulated classical friend can safely work in ZDC and be assured that he 
is proving statements which are true of his world. This not entirely trivial 
proposition expresses the fact that the universe Z is a “model” of ZDC. 



228 


Notes on set theory 


B.3. A set universe At is any triple M,S,E, of a class (which may be a 
set) M , a subclass S C M , and a binary definite condition E such that 
E (x, y) => x, y G M and for each x G M , 

b M {x) =df {t I t E x} is a set, (B-7) 

which means that for some set X and all t, 

t g x E(t, x). 

We write synonymously 

tEx E(t,x) 

for the condition E, which interprets the G condition in At, and we call b M M 
the body of each x G M. The At -objects are the members of M and the As- 
sets are the members of S. An n-ary definite condition is an At -condition if it 
only holds of objects in M, i.e., 

P(x i, . . . , x n ) => xi, . . . , x„ G M; 

and a definite n-ary operation F is an At-operation if it assigns At -objects to 
At -objects, i.e., 

x\ ,x n G M=$-F(x i ,x n ) G M. 

A universe At is natural if the class M is transitive and 5. A are the standard 
sethood and membership conditions, i.e., 

X £ M =>X CM, 
x G S x G M&Set(x), 
x E y ■<==>■ x, y G M & x G y. 

A natural universe At is completely determined by the transitive class M of its 
objects and we will identify it with that class, i.e., when we refer to the universe 
M for a transitive class M . we will mean the natural universe with objects the 
members of M. Notice that in a natural universe the body of each object is 
the set of its members. 

b M {x ) = {t | t G x} (x G M)\ (B-8) 

this means that b M (x ) = x. if x is a set. but b M {x) = 0 if x is an atom. 

The relativization to a universe At of a proposition 0 is the proposition 
6 {M) constructed by replacing “object” by “ATobject”, “set” by “At-set”, 
“x Gy” by “x E y”, “condition” by “At -condition” and “operation” by “At- 
operation ”, consistently wherever these expressions occur in 6. If 0 ( - M) is true, 
we say that 6 is true in the universe At or (synonymously) that At satisfies or 
models 0. 

A universe At is a model of an axiomatic system T if every axiom of T 
is true in At, i.e., if the relativization 0 <M ' of every axiom 6 of T is a true 
proposition. If this holds, we call At a universe of T, or simply a T-universe. 
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B.4. Propositions and relativizations. By proposition we mean any ordinary, 
definite mathematical statement or assertion, just like the axioms, hypotheses 
and theorems we have considered so far. This is not completely precise, every- 
day mathematical English not being a perfectly specified language. The basic 
idea is that for all objects x and y, the expressions x = y, x G y and Set(x) are 
propositions; for each definite condition P and objects xi, . . . , x„, the expres- 
sion P(x i x n ) is a proposition; and that propositions may be combined 

by the basic operations of logic, -i, V, 3 and the like. Relativizations 33 to a set 
universe M are computed as one might expect, e.g., 

\x=y) :x = y 9 

Set(x)^' M) : x G S, 

(x G y) (M) : xE y, 

P(x i,... ,x„) (M) : P(x , x n ) , 

(-. e) [M) ; -.( 0 ^), 

((/>& & xp^ M \ 

{{\/x)(/)) ( ' M ' > : (Vx G M)(j) {M) , 

{{VP)<t>) (M) : (VP)[(Vx, y)[P{x, y) => x , y G M] => 

where P varies over binary, definite conditions, etc. In the proofs which follow, 
we will take care to spell out laboriously the relativizations of all the proposi- 
tions that concern us. This will produce many examples which illustrate the 
notion, and it will also ensure that the specific results we claim will be precise 
and rigorously established even if the general notion of proposition and the 
relativization process are not precisely delimited. On the other hand, much 
of the significance of the results comes from the following general principle. 
It says simply that logical consequences of true (in M) propositions are true 
(in M) for any universe M, and, of course, it holds for arbitrary models of 
arbitrary axiomatic theories, not just set universes. 

B.5. Principle of Soundness of Logical Inference. If a proposition 9 is a theorem 
of an axiomatic system T ( i.e ., it can be proved by logic alone from the axioms 
of T ), then every universe of T satisfies 9. 

B.6. Universes vs. general models. According to the discussion in 8.22, to 
define a model of an axiomatic set theory we must specify a domain of objects. 


,3 To those who know formal logic, it might appear that the completely precise (syntactical) 
relativization operation on formulas is much easier to understand than this relativization opera- 
tion, which is applied directly to propositions of “mathematical English”. But to apply the formal 
relativization process, we must first “formalize” the ordinary language propositions in which we 
are ultimately interested, and a moment's thought will convince you that the formalization process 
is exactly as “vague” as the present operation of relativization. It may be argued that we know 
how to formalize a certain proposition precisely if we can interpret it in some arbitrary structure, 
i.e. , precisely if we can understand its informal relativizations. 
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define on it the conditions of membership and sethood and also specify which 
conditions and operations on the domain will be considered definite. Set 
universes are very special models in two ways. 

(1) When we view a set universe At as a model for an axiomatic set theory, 
we take its definite conditions to be all the definite conditions of our basic 
domain W, and we take for its definite operations all the At -operations, i.e., 
the definite operations of W which take At-objects to At-objects. 34 It is 
routine to verify that all the axioms for definite conditions and operations 
listed in 3.18 hold with this interpretation, Problem xB.l: thus, to prove 
that a set universe At is a model of (say) ZFDC, it is enough to prove the 
relativizations to At of axioms (I) - (VIII). 

(2) Because we assume that the body b M (x) of each x G M is a set, At-sets 
cannot be “larger” than the sets of the intended universe VV: for example, 
there cannot be an At -set x such that for all t, t Ex. 

Natural universes are even more special, of course, as they only restrict the 
domain of objects to some transitive class — and this makes it especially simple 
to understand the meaning of the relativization operation for them. From the 
mathematical point of view, natural universes are subuniverses of W and they 
inherit their structure from >V. much like subgroups, subposets, topological 
subspaces and the like are specified by a subset of some given space and inherit 
the relevant structure from it. The additional subtlety here is that we need 
to interpret (relativize) in these subuniverses propositions which are logically 
quite complex, more complex than the typical identities or inclusions which 
come up in Algebra or Topology. 

We begin with the verification of the axioms we have been studying in 
natural universes which we have already introduced, where the notions are 
most familiar. 

B.7. Lemma. Every transitive class M satisfies the Axiom of Extensionality. 

Proof. The relativization to M of the Extensionality Axiom reads as fol- 
lows: 

(I) A) Extensionality: For all sets A. B G M, 

A = B (Vx G M)[x €A x G B], 

If ^4 = B, then A and B have the same members, so they certainly have the 
same members in M. To prove the converse, we must show that if A f B, 
then there must exist some x G M, such that either x G A \ B or x G B \ A: 
but if A fi B, then there certainly exists some 

x G (A \ B) U {B \ A) C A U B. 

A. B C M because M is transitive, and hence this x is also in M. H 

34 More pedantically, the definite conditions of M are the restrictions to M of the definite 
conditions in the intended universe W, and similarly for the operations. 
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B.8. Lemma. Every Zermelo universe M is a natural universe of axioms (I) - 

(VI). 

Proof. We consider in turn each of the axioms (I) - (VI) other than (I) just 
shown and (IV) for which we gave the argument in the example above. The 
reader is advised to compare the relativization of each axiom to M which we 
must prove with the original statement of the axiom in Chapter 3. 

(II) < M) Emptyset and Pairset: (a) There is a special object y G M which is 
a set but has no members in M. (b) For all x. y G M, there is a set z G M such 
that 

(ft G M)[t G z [t = v V t = y]\. (B-9) 

(a) I £ M by hypothesis, it has no members whatsoever, so it certainly has 
no members in M. (b) If x,y G M , then z = {x, y} G M by hypothesis, and 
it obviously satisfies (B-9), since it satisfies the stronger 

(V?)[r g z [t = x v t = y]]. 

(HI) (AT) Sgp arat i on 

Axiom: For each set A G M and each unary, definite 
condition P, there exists a set B G M which satisfies the equivalence 

{fix G M)[x G B <£=> x&A&.P{x)]. (B-10) 

Suppose A € M. By the Axiom of Separation, there exists some B which 
satisfies 

fix)[x&B <==> xeA&P(x)]. (B-ll) 

Now B C A, so B e V(A) e M, so B e M because M is transitive, and 
(B-ll) implies the weaker (B-10). 

(V) (*r) pjnionset Axiom: For each set W € M, there exists a set B € M 
which satisfies the equivalence 

fit G M)[t G B fiX G M)[X G g G X]]. (B-12) 

Again, we naturally take B = (J IS , which is in M by hypothesis and satisfies 
the equivalence 

(Vi)[ig« ^ (3I)[lGr&IGl)]. (B-13) 

Using once more the transitivity of M, immediately, for every t, 

{3X)[X Gf&(Gl] fiX G M)[X G g? & f £ X], 

so that (B-12) reduces to 

fit G M)[t G5 fiX)[X G W & t G X]], 
which is implied by the stronger (B-13). 

(VI) ! V/ I Axiom of Infinity: There exists a set I G M such that 

0 G / & fix G M)[x G / => {x} G I]. 

This is quite simple, taking / = No G M. H 
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B.9. Theorem. (1) Every Zermelo universe is a natural universe of ZDC, and 
every ZFDC -universe is a natural universe of ZFDC. 

(2) (AC) Every Zermelo universe is a natural universe of ZDC+AC, and 
every ZFDC-universe is a natural universe of ZFDC+AC. 

In particular, Z and every M(I) such that No C / are natural universes of 
ZDC, or universes of ZAC, granting AC. 

(3) (von Neumann) (AC) The von Neumann universe V is a natural universe 
of ZFC. 

Proof. The relativizations of DC and AC were proved in 11.23 and the 
relativization of the Axiom of Replacement is exactly the defining condition 
of a ZFDC-universe in 11.33. For Part (3), we have already shown that V is 
a ZFDC -universe in 11.34, so it only remains to prove the relativizations to V 
of the Principles of Purity 3.25 and Foundation 11.29. The first of these is 

Purity^: For every x G V, Set(x), 

and it is true simply because every object in V is a set, and the interpretation 
of “sethood” in V is the standard one. For the second, it is easiest to relativize 
the elementary version of the Foundation Principle in 11.30. 

Foundation^ ' : For every set X € V, there exists some m e X such that 
m n X is empty in V, i.e., 

(Vf € V)[? £ mV t £ X], 

The negation of this would give us an X € V and some a € X such that 
(V/77 e X)(3? € X)[t € m], 

which by DC (as in the proof of 11.30) implies that X is ill founded, contra- 
dicting the assumption X e V. 3 

The Soundness of Logical Inference B.5 combines well with B.9 to yield 
many simple but interesting independence results about ZDC and ZDC+AC: 
to prove that a proposition 0 cannot be a theorem of ZDC, it is enough to find 
some / 3No such that is false. This is exactly the way we proved B.2, 

a bit clumsily without the precise notions. We have included in the problems 
several examples of this kind. 

By the same reasoning, Part (3) of Theorem B.9 implies that we cannot 
refute in ZFAC the Principles of Purity or Foundation, because V is a model 
of ZFAC which satisfies these principles. It should also be obvious that we 
cannot prove these principles in ZFAC, but to establish this rigorously we need 
to construct universes of ZFAC which are not natural. The basic tool for such 
constructions is the next simple notion. 

B.10. Definition. A Rieger universe is any set universe M = M, S, E such that 
for every set Y C M, there exists exactly one Ad-set X satisfying Y = h M (A). 
For each Y C M we set 

Pm(Y) =df the unique X e S such that b M (X) = Y, 


(B-14) 
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so that pM is a definite operation and immediately from its definition, for 
every F C M, 

X = p M {Y) ^ X &S8ib M {X)=Y (B-15) 

t E p M ( Y ) t £ Y. (B-16) 

B.ll. Rieger’s Theorem. Every Rieger universe is a universe of ZFDC. 

Proof. Fix M = M, S, E with the Rieger property and let b(x) = h M (x). 
skipping the subscript since M is the only universe around. We verify in turn 
the relativizations of all the axioms of ZFDC. 

(I) (-M) £ xtens j ona jj t y Axiom: For all A.B G M , if S(A) and S(B), then 

A = B <=> (\/x G M)[xE A ^ xEB], 

If A = B, then surely, for all x, xEA xEB. Conversely, if for all 

xGM,xEA x EB, then b(A) = b(B); by the Rieger property, there 

is exactly one C G S such that b(A) = b(C); so we must have C = A, and 
similarly C = B, hence A = B. 

(II) (AT) 

and Pairset: (a) There is a special object y G S such that 
(Vt € M)->tEy. (b) For all x,y G M, there is some z e S such that 


(V? G M)[tEz 


t = x V t = y]. 


For (a), choose y so that y G S and b(y) = 0, and for (b) choose z G S so 
that b(z) = {x,y}, both times by applying directly the Rieger property. In 
the case of (b), for example, we compute: 

t E z <=>■ t G b(z) •£=>■ [t = x V t = y], 

which is the required conclusion. 

(IV) (ad p owerset Axiom: For each set A G M, there exists some B G S, 
such that for every X G M, 

XEB X £ S & (V? G M)[t E X => t E A], 

Given A G M, choose B G S by the Rieger property so that 

b[B) = {p(Y)\Y Cb(A)}- (B-17) 

where p{ Y) = p M ( Y ) is the Rieger operation associated with M by (B-14), 
and compute: 

XEB X G b(B) 

4=4- (3 Y)[Y Qb(A)8cX = p[Y)\ 

A=* (3F)[F C b(A) 8c[X g S&h(X) = F]] by (B-15) 
t^JG S&b(X ) C b(A) 

X G S&(Vt G M)[tEX=>tEA]. 

Verifications of the remaining axioms in M are similar, following the same 
ideas as in the proof of Theorem B.9, and we leave them for the exercises. 3 
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B.12. Exercise. Prove that a set universe M = M, S, E is a Rieger universe if 
and only if( 1 ) it satisfies the Axiom of Extensionality, and (2) for every Y C M , 
there exists some X e S such that b M (X) = Y. 

B.13. Exercise. Every Rieger universe is a model of the Axioms of Separation 
(III) and Replacement (VIII). 

B.14. Exercise. (AC) Every Rieger universe is a model of ZFAC. 

B.15. Relativization of “faithfully modeled” notions. The Choice Principles 
DC and AC are formulated in terms of the notions of “function” which was 
defined in Chapter 4 using an “arbitrary but fixed” ordered pair operation 
4.1, and “system of natural numbers”, which was also “arbitrary but fixed” in 
5.9. It is not completely obvious how to relativize propositions involving such 
“faithfully modeled” notions, since (for example) a given universe M may not 
be closed under the chosen ordered pair operation. We avoided the problem in 
11.23 and B.9 by assuming that the fixed ordered pair is the Kuratowski pair 
under which every universe satisfying (I) - (VI) is closed, but there may be 
some lingering vagueness on how to deal with this problem in general. There is 
an easy solution for Rieger universes, which we outline in Problems xB.13 and 
xB.14. In discussing Rieger universes from now on, we will assume tacitly that 
we have fixed an ordered pair operation, a system of natural numbers, etc. for 
each of them, and we will relativize propositions which involve these faithfully 
modeled notions in terms of these fixed operations. Problems xB.13 and xB.14 
make it clear that which particular definitions are chosen is irrelevant for the 
results. 

B.16. Proposition. There exists a Rieger universe Mat in which every set is 
equinumerous with a set of atoms ; in particular, we cannot prove in ZFDC that 
every set is pure. 

Proof. The idea is to code every set A by the pair (0 ,A) in M s t, and to 
declare that every object which does not code a set in this way is an atom. We 
set 


x € M at ^=>df x = x, so M at = W, 
x € S at *' == ^ > df (3A)[Set(A) &x = (0.A)], 
x £ a t (0, d) ^=^df x G A, 

and proceed to verify that Mat = M at , S at , E at is a set universe, that it has the 
Rieger property and that it satisfies the proposition “every set is equinumerous 
with a set of atoms”. 

Skipping the subscript M a t, clearly 

, , _ f 0, if for all sets A, x (0 .A), 

\ A, if for some (necessarily unique) set A, x = (0, A), 

so each b(x) is a set. For the Rieger property, we notice first that for each 
set A C M at , (0. A) € Sat and A((0.A)) = A: if x € S and b(x) = A, then 
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x = (0, B) and b(x) = B for some set B, by the definition, so we have A = B 
and x = (0, A), as required. The Rieger operation is very simple in this case, 

p(Y) = (0, Y). 

The relativization to M a t of the proposition “every set is equinumerous 
with a set of atoms” is the assertion that for every M at -set (0, A), there is a 
bijection in Mat between (0. A) and some Mm- set of atoms (0. B). We take 

5=df {(U) \t£A}; 
now (0, B) G Sat and, by the definition, 

xE at (0, B) => X G B => (Vy)[x f (0, y)] => x £ 5 at , 

i.e., each _M at -member of (0, B) is an atom in M at . If (x, y) at is the ordered 
pair operation of Mat and 

/ =df p{(t, (k0)at | t G A}, 

then recognizes / as a bijection between the Ad a t-sets (0. A) and (0, B).A 
By the anthropomorphic “M a t recognizes f . . . ” we simply mean that if 
A' = (0. A) and B' = (0. B) are the objects in M at which code the sets A and 
B, respectively, and if we set 

9 4=>df / Cdx B&(\/x G A’){3y G B')(x,y) g / 

&(Vy G B')(3\x G A')[(x,y) g /]. 

then the relativization 0^“) of the proposition 0 to A4 at is true. This rel- 
ativization can be computed in principle, but it is quite messy. It is best to 
develop a machinery for arguing about relativizations without actually writing 
them out, and for this the following, traditional “model theoretic” notation is 
very useful. For each set universe M and each proposition 9, 

M \= 9 4=»df 9 is true in M 9^ M \ (B-18) 

We read M \= 9 “M models 9”, but also “M thinks that 9”, “M believes that 
9”, etc. as befits the occasion. For example, for each pair of classes M, S and 
each binary condition E, let 

Setuniv(M, S, E ) 4=>df v)[uEv => u, v G M] 

&(W)[t G S=>t G M] 

& (Vx)[x G M => 

(3JST)[Set(JSr) & (V/)[? G X *=> tEx]] 

be the fairly complex proposition which asserts that M, S, E comprise a set 
universe. Consider also 

Rieger (M, 5, E) ^> df (VT)[(Set(F) &Y CM) 

=>(3\X eS)[Y = b M (X)]] 

(VT)[(Set(T)& Y c M) 

=> {BIX G S)(W G M)[t G Y 


tEX]], 
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which asserts that At is a Rieger universe. These are propositions about M , 
S and E, which may be true or false; whether they are true or not. it makes 
sense to interpret them (relativize them) in some universe M' and ask if M! 
thinks that M, S, E comprise a Rieger universe! There is an obvious question 
here, which has a simple and useful answer. 

B.17. Theorem. Suppose Mi = M\,S\,E\ is a Rieger universe, S 2 C M 2 C 
M\ are classes and E 2 is a binary condition such that xE 2 y => x, y € M 2 ; if 

M\ \= Setuniv(M2,S2.f?2)&Rieger(M2.S , 2,£'2) . 

then M 2 = M 2 , S 2 . E 2 is also a Rieger universe. 

Proof. To show first that M 2 is a set universe, we must verify that for every 

x G M 2 , 

(3 7 G Set)(V?)[t G 7 t G M 2 &tE 2 x], (B-19) 

Fix some x G M 2 . The proposition (B-19) is true in M\ since 
Mi |= Setuniv(M 2 , S 2 . E 2 ), 

which means that some Y t G S\ exists such that for all t G M\, 

tEiYi t G M 2 &tE 2 x\ (B-20) 

and since Adi is a set universe, there exists some set 7 such that for all t, 

t G 7 t G Mi&tEiYi; (B-21) 

now (B-20) and (B-21) together imply what (B-19) demands for the given 
x G M 2 and some 7. 

To show that Al 2 is a Rieger universe, we must show hrst that if 7 C M 2 , 
then there exists some X 2 G S 2 such that for every t G M 2 , 

t G 7 tE 2 X 2 , (B-22) 

and then verify that this X 2 is unique. Since 7 C M 2 C Mi and A1 1 is a 
Rieger universe, there exists some h GSi such that for all t G Mi, 

t G 7 ^ tEiXp, (B-23) 

working in A1 1 . we apply the Rieger property for Al 2 to the A1 1 -set X\ , to get 
some X 2 G S 2 such that 

Mi h (W)[f G Mi ==>■ [f G Xi ^ t E 2 X 2 ]\: (B-24) 

and computing the relativization. this means that 

(Vf G Mi)[t G Mi => [t Ei Xi <$=> tE 2 X 2 ]]. (B-25) 

Compute now. for any t G M 2 C Mi : 

(G 7 t Ei Xi by (B-23), 

<<=► tE 2 Xi by (B-25), 

which proves (B-22). The fact that at most one X 2 G S 2 can satisfy (B-22) for 
every t € M 2 is proved similarly. 3 
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Remark: Notice that in relativizing (B-19) in this proof, we left the clause 
t G Mi alone. In computing relativizations, “primitive” propositions of the 
form P(x i, . . . , x n ) (which express that a definite condition P holds of the 
objects xi , . . . , x„) are their own relativizations. 

This simple theorem makes it possible to construct universes within uni- 
verses within universes, each time using the properties of the model just con- 
structed. Consider, for example, the next Corollary of B.16. 

B.18. Proposition. There exists a ZFDC -universe M which has exactly one 
atom. 

Proof. Suppose c is an atom and let 

M c = d f {x | (W G TC(x))[-.Set(0 => t = c]} (B-26) 

be the class of objects supported by {c } in the sense of Problem 11.38. It is 
quite easy to verify that M c is transitive and for each set X, 

X C M c => X G M c , 

so the natural universe M c has the Rieger property and by B.ll it is a model 
of ZFDC; and it is quite clear that it has exactly one atom, c. 

So far so good, as long as there exists at least one atom, which may or may 
not be true in our intended domain W. But there are lots of atoms in the 
Rieger universe M at of B.16, so what we need to do is to interpret the proof of 
the preceding paragraph in the universe M a t- This argument runs as follows. 
Let 


(f>(M, S. E. c) 4=>df Setuniv(M S,E)& Rieger(M. S. E) (B-27) 
& Atom(c) & (V? G M)[t S => t = c] 

be the proposition which asserts of M. S. E. c that they have the properties we 
are interested in, and let 

9 ^n{3M){3S)(3E){3c)(l>{M,S,E,c) (B-28) 


be the proposition which asserts that some M, S. E. c with these properties 
exist. We have proved 6 from the hypothesis that some atom exists, and other 
than that we have only used the axioms of ZFDC — what else is there! Hence 
this 6 is true in every universe of ZFDC which has an atom, in particular M- dt , 
i.e., 

Mat H #• 

This means that for some M a t -classes M, S, some binary M at -condition E 
and some Abt-object c, 

M a t b 4>(M,S,E,c), 


and in particular 


M a t b 


Setuniv(M. S,E)& Rieger(M, S, E ) 
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Figure B.l. Two decorated, ill-founded graphs. 

so by B.17, M = M, S, E is a Rieger universe. In addition 

M at b Atom(c) & (Vt G M)[t ^ S =» t = c] , 
which means precisely that 

M b Atom(c) & (Vt)bSet(t) t = c], 
the required conclusion that M believes that exactly one atom exists. H 
It is not hard to manufacture Rieger universes with various types of ill 
founded sets, by a combination of the techniques in B.16 and B.17. Some 
of the problems are about such results. Here we will concentrate on the 
construction of Aczel’s Antifounded Universe A, which has a rich variety of ill 
founded sets with well understood structure. 

The idea for A comes from the Mostowski Collapsing Lemma 11.36, which 
gives a “structural” characterization of pure, grounded sets. Recall that by 
11.35, a decoration of a graph G is any surjection d : G — » d[G] such that 

d(x) = {d(y) | y <— x} (x € G), (B-29) 

where — > is the edge relation on G and <— is its inverse, 

y <— x y is a child of x <=>• x — » y. 

Each grounded graph G admits a unique decoration d G , and the pure, 
grounded sets are all the values d G {x) of these decorations. Can we also 
“decorate” the nodes of ill founded graphs to get pure, ill founded sets which 
are related to ill founded graphs in the same way that pure, grounded sets are 
related to grounded graphs? 

B.19. Antifoundation Principle, AFA. Every graph admits a unique decoration. 

In Figure B.l we have labeled the nodes of two ill founded graphs by the 
values of their unique decorations, assuming that such exist. By the definition 
of decoration, 

f2={Q}, £V={0,Q 2 }, f2 2 = {fl 1 }- (B-30) 

i.e., Q, Q 1 and Qr are the “ultimately frustrating gifts” we discussed in 11.32. 
We can refer to “the” frustrating gifts, because, in fact, the equations in 
(B-30) — or the graphs in Figure B.l — characterize these sets under AFA, as 
follows. 
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B.20. Proposition. (AFA) (1) There is exactly one set Q which is its own sin- 
gleton. (2) There is exactly one pair of sets Q 1 , Or such that Q 1 = {0, O 2 } and 

0 2 = { 0 ‘}. 

Proof. (1) If X = {X} and Y = {7}, then we can use either X or Y 
to decorate the single node graph in Figure B.l: but this graph has only one 
decoration by AFA, so X = Y. The proof of (2) is similar. H 

This “uniqueness” part of the Antifoundation Principle we just applied 
makes it possible to specify and analyze the structure of ill founded sets with 
diverse properties, and is the main advantage of the antifounded universe A 
over other models which contain ill founded sets. We now proceed to its 
construction. 

B.21. Definition. A pointed graph is a pair ( G . p G ) of a graph and a node in it, 
in full detail, a structured set (G, -+ G ,p G ) where p G G G and — > G is a binary 
relation on the field G. The designated node p G is the point of the pointed 
graph. 

B.22. Pictures. A pointed graph (G, p) is a picture of a set A, if there exists a 
decoration d : G — » d[G] of G such that d G (p) = A. The canonical picture of 
a pure set A is the pointed graph (TC(A), 9, A), where TC(A) is the transitive 
closure of A and 9 is the restriction of the inverse membership condition to 
TC(A). This is a picture of A, because the identity function d(x) = x is 
obviously a decoration of it, A G TC(A) and d (A) = A. 

B.23. A bijection n : G >-» H between two graphs is an isomorphism if it 
respects the edge relations, 

v >g y n { x ) -> H n { y ) ( x,y G G); 

and an isomorphism between two pointed graphs (G, p) and [H.q) is a graph 
isomorphism n : G H such that n( p ) = q. We call G isomorphic with H if 
there exists an isomorphism n : G >-» H of the appropriate kind. 

It is easy to construct non-isomorphic pointed graphs which picture the 
same set, even grounded ones, e.g., see Figure B.2 where we have labelled 
the nodes with the values of the unique decorations. On the other hand, we 
would expect that if a pointed graph (G. p) admits a unique decoration d G , 
then the set A = d G [G] captures some important invariant of ( G,p ). The 
next fundamental definition identifies that invariant. 

B.24. Definition. A relation R C G/H is a bisimulation between two pointed 
graphs (G, -* G .p G ) and (H. — >#. p H ), if it relates the points, i.e., p G Rp H and 
also satisfies the implication 

xRy => (Vw g<— x)(3v h<— y)uRv 

&(Vu h<— y)( 3m g 4- x)uRv. 


(B-31) 



240 


Notes on set theory 


« {{ 0 }} 

bm 

C 0 d 0 


« {{ 0 }} 

P {0} 7 {0} 


5 0 


Figure B.2. Non-isomorphic, bisimilar, grounded graphs. 

Two pointed graphs G, H are bisimilar if some bisimulation between them 
exists, 

G =bs H 4=>df (3i? C G x H)[R is a bisimulation]. (B-32) 

As usual with structured sets, we will often refer to “a pointed graph G”, 
skipping the explicit reference to the edge relation — > G or the point when it is 
obvious or irrelevant — we already did this in (B-32). 

B.25. Exercise. Isomorphic pointed graphs are bisimilar , and so are the non- 
isomorphic , grounded, pointed graphs G and H in Figure B.2. 

B.26. Exercise. If a is a minimal node in a graph G and b is a minimal node in 
H . then {{a, b)} is a bisimulation of the pointed graphs (G, a) and (H, b). 

B.27. Exercise. Let L be the “single loop ” graph on a singleton { a } , with the one 
edge pair (a, a), and on the set of integers N define the successor edge relation 

n —> s m n + 1 = m. 

Show that the relation {(a, i) \ i 6 N} is a bisimulation of (L, a) with (N, n), 
for every n. 

B.28. Lemma. The condition =b s is an equivalence condition on the class of all 
pointed graphs. 

Proof. Of the three properties of an equivalence condition (defined in 
12.42), only the transitivity of =b s is not immediate. To prove that, suppose 
G\, G 2 and G 3 are pointed graphs. R\ is a bisimulation of G\ with G 2 and If 
is a bisimulation of 63 with G 3 , and let 

xRz *t=>df (3j)[xf?i y&yR 2 z\ (B-33) 

be the “product relation” of R\ and R 2 . It is clear that R relates the points p\ 
and pi of Gi and G 3 , because p\ Ri p 2 and p 2 R 2 p 2 hold. Suppose that xRz, 
so there exists some y G 63 such that v R \ r and y R 2 z. If it i<— x, then by 
(B-31) for R i, there exists some v 2 <— y such that uR\v\ and then by (B-31) 
for R 2 , there exists some w 3 <— z such that vR 2 w, which together with uR\v 
establish uR w. This is half of ( B-3 1 ) , and the other half is equally easy. H 

With these definitions, we can now prove that for a grounded graph G and 
any p e G, the properties of {G.p) coded by the value d G {p) of its unique 
decoration are exactly those preserved under bisimulation of pointed graphs. 
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Figure B.3. Many pictures of Q : (L, a) and each (N, n). 

B.29. Theorem. For all grounded graphs G and H with associated decorations 
do and du , and for all p £ G and q £ H, 

doip) = d H {q) (G, p) =bs (H, q). (B-34) 

Proof. We verify first that the relation 

R = {{x,y) £ G x H | do{x) = d H {y)} (B-35) 

satisfies (B-31), as follows: 

xRy ==> {d G (u) | u G < — x} = {d H {v) \ v >’} by the def. of decoration 
=> (Vm g <- x)(3v h*~ y)[d G (u) = d H [v)] 

&(Vn H <- y){3u G < x)[d G (u) = d H (v)] 

=> (Vm x)(3n n<— y)uRv &(Vv h<— y)(3u <j<— x)uRv. 

Flence. if p e G, q £ H. and d G (p) = dn(q), then the relation R of (B-35) 
establishes that (G,p) =b s ( H,q ). 

For the converse, suppose towards a contradiction that p is minimal in G 
such that there exists some q £ H and a bisimulation R of the pointed graphs 
(G. p) and {H, q), but d G {p) f dn(q)- Now 

(Vm p)(3v //<— q)uRv, 

hence. 

(Vm g <- p){3v h <- q)[d G {u) = d H (v)] 
by the choice of p, and. similarly, (Vn h*~ q)(3u p)[d G (u ) = d H {v)], 
which proves d G (p) = dn(q), contradicting the choice of p. H 

It is a crucial property of AFA that it yields the same characterization of 
bisimulation for all graphs, by quite a different argument. 

B.30. Theorem (Aczel). (AFA) For all graphs G and H with associated deco- 
rations d G and dn. and for cdl p £ G and q £ H , 

d G (p) = d H (q) ( G,p)= bs (H,q ). (B-36) 

Proof. The left-to-right implication in (B-36) is proved exactly as in B.29, as 
that part of the argument did nor depend on the given graphs being grounded. 

Suppose now that the edge relations of G and H are — > G and -+ H and 
R C G x H is a bisimulation of G with Ft. We can turn R into a pointed 
graph, with point the pair (p G , p H ) and edge relation the product of -* G and 
—>h'- 

(p, q) ->■ (w. v) p u&q -> H v. 
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If d G is the unique decoration of the graph G (forgetting the point) given by 
A FA. define on R the function 

d G {p,q) =df d G {p), 

and compute: 

x G d G (p,q ) x G d G (p) 

(3m g <— p)[x = d G {u)] 

(3u G <— p)(3v h<— q)[uRv &x = d G (u)] (B-37) 
<=> (3 (u,v) <- (p,q))[x = dg(u,v)], 

where the key equivalence (B-37) holds because R is a bisimulation and pRq, 
and hence for each u G <— p, there exists some v q satisfying u R v. Thus, 
the function d§ is a decoration of R, and the corresponding extension 

dffip. q) =df d H (q) 

of the decoration d H of H is also a decoration of R , by the same argument. 
By AFA then, for all (p, q) G R. 

d G {p) = d§{p, q) = d§{p. q) = d H {q), 
which completes the proof. 3 

This characterization under AFA of the properties of (G,p) which are 
coded into the value d G (p) of its unique decoration, suggests a method for 
the construction of A. 

B.31. The Antifounded Universe. Let 

-do =df {(G, ~^g,Pg) € V | — C G x G &. p G £ G} (B-38) 

be the class of all pointed graphs on pure, grounded sets, and on Ao define 
the binary definite condition 

( G,p G )e 0 {H , p H ) 

•£=>df (3 q G H)[q u<— p H & (G, p G ) =bs ( H , ^r)], (B-39) 

skipping the edge relations in the notation. 

First we note that eo respects bisimulation: 

G\£()H\ & Gl =bs G 2 &H 1 — ~bs H 2 => G2£qH2- (B-40) 

To prove this, suppose — > 1 , p \ are the edge relation and the designated node 
of H 1 , and similarly with — > 2 , P 2 for H 2 . The hypothesis of (B-40) gives us 
some q\ \<— p\ such that 

G 2 =bs G\ =bs ( H\,q\ ), (B-41) 

and a bisimulation R of ( H\,p\ ) with ( ll 2 - p 2 ). By the basic property of 
bisimulations, there must exist some q 2 2 <— H 2 such that q\Rqy, this means 
that R is a bisimulation of (. H\,q\ ) with (. H 2 ,q 2 ), and then (B-41) with the 
transitivity of = bs gives 63 = bs (H 2 , q 2 ), hence G 2 £qH 2 . 
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Now each G G Ao is a pure, grounded set (a triple in V), even if it is 
ill founded as a graph, and the bisimulation condition = bs is an equivalence 
condition on Ao by B.28. By Problem xl2.45, there exists a definite operation 
a which is determining for = bs , i.e., for G. II G Ao 

G = bs H ^ a(G) = a{H). 

The domain of the antifounded universe is the quotient class of Ao by = bs , 

A = df {a(G) | G G A}- (B-42) 

We define on A the membership condition 

xey *^==4>df (3G, H)[x = a{G) & y = a(H) & GsqH], (B-43) 

unambiguously by (B-40), and hnally we take the Pure Antifounded Universe 
to be the triple A. A, e . We will refer to it by the name of its domain A. which 
is also the collection of its sets — there are no atoms in A. 

B.32. Theorem (Aczel). (AC) A is a Rieger universe, which further satisfies the 
Antifoundation Principle AFA, the Axiom of Choice AC and the Principle of 
Purity. 

Proof. The key property of A is that for each graph H G V with edge 
relation —* H and each node p G H , 

h A (a(H, p)) = {a(H, q) \ q H <- p}, (B-44) 

which follows from the following trivial computation: 

x G b A {a{Pf. p)) (3G G A 0 )(3</ p)[x = a(G) & G = bs ( H. , q)] 

<=3- (3 q H 4 - p)[x = a(H, q)]. 

This implies, in particular, that each b A (x) is a set, so A is a set universe. 

For the Rieger property, suppose Y C A and (using AC) choose for each 
y € Y a pointed graph G y G Ao, such that (1) a(G y ) = y. By replacing 
each G v by an isomorphic copy if necessary, we can also ensure that (2) 
y f z => 6 V n = 0, and (3) for all y € Y, 0 ^ G y . Let 

H =df [}{G y | ve T} U {0}. 
u —>h v -^=>df (3v G Y)[u -*,«V(h = 0&« = p y )], 

where —* y and p y are the edge relation and the point of G y . The pointed 
graph H with edge relation — and point 0 is obviously in Ao, and for each 

y G Y, 

( H.p y ) = bs G y 

by the trivial (identity) bisimulation 

{{u, v) G H x G y | u — u}; 


(B-45) 
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thus, by (B-44), and the definition, 

b A {a(H, 0)) = {a{H, q ) \ q H ^~ 0} 

= {a(H,p y ) | y G F} 

= {a(G y ) | y G F} by (B-45) 

= F 

To prove the uniqueness of a(H), suppose //' is any pointed graph in At, with 
edge relation — and point q' such that 

GeqH' a{G) G F 

By (B-44) again, 

h A (ot(H' . q')) = {a(H\ q) \ q </'} 

= {«(//. ft,) I v € F} by hyp. 

Thus, for each y G F, 

v = a(G y ) = a(H. p y )ea(H'), 

and we can choose (by AC) some q y q' and a bisimulation S y of ( H , p y ) 
with (//'. q y ); and conversely, by the same argument, for each q q' we can 
choose some y q G F and some bisimulation T q of ( H . q v ) with (H' , q). It is 
now easy to verify that the union 

R = U {5, | Py * <- 0} u {T q I q V q'} U {(0, q')} 

is a bisimulation which establishes that H =b s H' , i.e., a(/f ) = a(H'). 

Finally, to verify AFA for A, suppose G is a graph in A with edge relation 
— »g G -4. To prove that G admits a decoration in A, it is enough to define an 
A-operation d such that 

A\=(Wp e G)[5(p) = {S(q) | q g < p}\ (B-46) 

since A is a ZFDC-universe, so it “knows” from (B-46) that the restriction of 
d to G is a function, which is then a decoration of G. Let 

H = df b A (G) G V, 

and make H into a graph in V with the edge relation 

X -► H y A \= x -> G y (x, y G H). (B-47) 

For each p g H, set 

dip) =df ol{H. p) G A {p G H) (B-48) 

and compute: 

h A id ip)) = {aiH, q) \ q H <- by (B-44) 

= {aiH, q)\A\=q g<— p} by (B-47) 

= {diq) | A b q G+~ P) by (B-48). 
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Put another way, for each peG, 

xeS(p) A \= (3q p)[x = 8{q)], 
which is equivalent to (B-46). 

It remains to show that G admits at most one decoration in A. and for this 
it suffices (as above) to show that ifd' is any „4-operation such that 

-4 b (Vp € G)[S'(p) = {S'(q) | q g< p}\, (B-49) 

then S' {p) = S(p), for every p G H . Given such a S', choose (by AC) a 
pointed graph H p with point r p for each p G H, such that 

S'(p)=a(H' p ) ( peH ). 

and make sure as in the proof of the Rieger property that these graphs are all 
pairwise disjoint. If H’ = (J {//' | p G // } is the union of all the graphs, then 

6 , {p)=a{H , ,r p ) (p € H), 

since (trivially) the identity relation {(q,q) \ q € H} is a bisimulation of 
{H' p , r p ) with ( H , r p ). We now claim that the relation 

R =df {{p, s)GHxH'\ a{H' , s ) = 5'{p ) } 

is a bisimulation of ( H , p) with [H ' , r p ) for each p G H . This will complete 
the proof, because for p g H , «(//'. r p ) = d'(p). hence pRr p , and hence 

d(p) = a(H , jp) = a(/f'. r p ) = d'{p). 

To show the somewhat less trivial half of the italicized statement, let — be 
the edge relation of H' , assume and compute: 

t s => a{H r . t)sa{H r , s) 

==> a(/f' ,t)ed'{p) becaus epRs 

=> for some q p, a{H' . t ) = <5'(^r) by (B-49) 

==> for some q p,qRt. 

The Axiom of Choice for A follows from B.14, and the Principle of Purity is 
trivial. H 


Problems for Appendix B 

xB.l. Prove that for each set universe M = M,S, E, the axioms for definite 
conditions and operations listed in 3.18 become true, if we replace in them 
“condition” by “A4 -condition”, “operation” by “A4-operation”, G by E, Set 
by S and (Vy) by (Vy G M). 

*xB.2. Suppose pairs and Cartesian products are defined by the Kuratowski 
operation of 4.2. Show that U,^{No}" ^ Z and infer that the following 
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proposition is not a theorem of ZDC: for each set A, there exists a function 
f : N — * f[A] such that 

f( 0) =AxA, f(n + 1) = /(«) x A. 

*xB.3. Construct a definite operation (x, y)' with the following properties. (1) 
(x, y)' is an ordered pair operation, i.e. , it satisfies (OP1) and (OP2) of 4.1. 
(2) If Z f G Z, then their Cartesian product X x Y is also in Z. (3) If 
\Jf =1 A n is defined using this pair, then for each A G Z, (J^ 2 € -Z. 

*xB.4. Show that the implication (OP1) => (OP2) in 4.1 is not a theorem of 
ZDC for any definite operation (x, y). 

*xB.5. Verify that if / is a transitive set. then 

A G M(I)=>TC(A) G M(I). 

xB.6. For each 7, define K n (l ) by the recursion 

Kq{I) = I, K n+l (l)=K„(l)U‘P(K n (l)). (B-50) 

Show that 

M n (I) C K n (TC n+ i(I)) CM (I), 

where TC„(7) and M„{I) are defined by (11-16) and (11-18). respectively. 

*xB.7. Find some 7 D N 0 such that TC(7) ^ 47(7). Infer that ZDC cannot 
prove that “every set has a transitive closure”. 

*xB.8. The implication (Cl) => (C3) in 4.20 cannot be proved for an arbitrary, 
definite operation \A\ in ZDC. (Cf. Problem xll.6.) 

*xB.9. The equivalence in Problem xll.7 is not a theorem of ZDC. 

* xB.10. Assume that the full Axiom of Choice and the Generalized Continuum 
Hypothesis is true, so for all cardinals k, 2 k = c k + . Prove that 

( n i— > H„) C Z, {ri i — ^ H„) ^ Z. 

Infer that ZDC+AC cannot prove the existence of an infinite, increasing 
sequence of infinite cardinals, i.e., the proposition 

9 : (3/ : N — > /[ N])(V« G N)[N < c f(n) < c f(n + 1)]. 

*xB.ll. Show that ZDC cannot prove the proposition “the well ordered set 
N of integers is similar with an ordinal”. Hint. Use Problem xl2.27. The 
less trivial part of the problem is how to compute (or avoid computing) the 
relativization of this fairly complex proposition. 

* xB.12. (AC) Show that ZFC cannot prove that strongly inaccessible cardinals 
exist. Hint. Go by contradiction and interpret the meaning of the alleged 
theorem in V K , where k is the least strongly inaccessible cardinal. 
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xB.13. An ordered pair operation in a Rieger universe M is any binary M- 
operation C such that for all x, y, x' , y' € M , 

C(x,y ) = C(x',y') x = x' &y = y' . (B-51) 

Cartesian products and function spaces relative to C are defined by 
A x c B =df p{C{x, y) | x E A, y E B}. 

(A B) = df p{f e M I {MtEf)[tEA x c B] 

& (\/x E A)(3\y E B)[C (x, y) E /]}, 

where p( F) is the Rieger operation of M defined in (B-14). Verify that these 
definitions make sense (i.e., p is applied to appropriate arguments) and hence 
A x c B and A —> c B are M -operations. 

*xB.14. Define triples, structured sets and systems of natural numbers in an 
arbitrary Rieger universe M, relative to an arbitrary ordered pair operation 
C (x, y) in M. Formulate the Choice Principles DC, AC N and AC using these 
notions and prove that every Rieger universe M satisfies DC, and if AC is also 
true, then M also satisfies AC. 

xB.15. Show that the Rieger universe A4 at of B.16 has an ordered pair oper- 
ation C such that for all x and y, the “pair” C (x, y) is an atom. 

*xB.16. (AC) Define a Rieger universe M which satisfies the following two 
propositions. 

(a) There exists a binary, definite condition < at which well orders the class 
of atoms, in the sense that (1) for all atoms a, b, c, 

a < a, [a < b&b < c] => a < c, [a < b & b < a] ==> a = c, 

(2) for every two atoms a, b, either a < b or b < a, and (3) every non-empty 
set of atoms has a < -least member. 

(b) Every set X is equinumerous with an <-initial segment of atoms, i.e., 
for some atom b, X = c {a \ Atom (a)&a < bj. Hint. Make the ordinals 
atoms in some Rieger universe. 

* xB.17. Define a Rieger universe which has at least two, distinct self-singletons, 
i.e., sets a and b such that a ^ b, a = {a} and b = {b}. Hint. Start with a 
universe which has two atoms and imitate the coding construction in B.16. 

* xB.18. Define a Rieger universe which contains an infinite sequence of distinct 
sets xo, xi, . . . , such that for each x,- = {x, + i}. 

xB.19. Given a graph G and a node p e G, let 

G\p = {x&G\x = pVp=>x} 

consist of p and all the nodes on a path below it. Consider G \ p as a subgraph 
of G, with the restriction of the edge relation — > G to it, and prove that 

(G,p) =bs {G \p,p). 
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B.33. Definition. A partial bisimulation between two graphs G and H is any 
relation R C G x H which is a bisimulation of the pointed graphs ( G . p ) and 
(H, q) for every (p, q ) G R: it is a total bisimulation if in addition 

(Vp G G)(3q G H)p Rq & (Vq G H)(3p G G)pRq. 

Two graphs are bisimilar if there exists a total bisimulation between them. 

xB.20. For all pairs of graphs G, H , there exists a largest (under C) partial 
bisimulation R between G and H , and G =b s H if and only if this largest 
bisimulation is total. 

xB.21. Two graphs G, H are bisimilar if and only if every pointed graph 
(G. p) with p G G is bisimilar with some ( II. q), q G H , and conversely, every 
(H, q) is bisimilar with some (G, p). 

xB.22. (AFA) Prove that there exist distinct, pure sets x, y and z such that 

x 9 y 3 z 9 x, 

and draw a picture of them. 

*xB.23. (AFA) Prove that there are only two, transitive, pure singletons. How 
many transitive, pure doubletons are there? Draw pictures of them. 

xB.24. (AFA) With the Kuratowski pair, prove that there exists a pure set x 
such that 

x = (0, x), 

and draw a picture of it. 

xB.25. (AFA) With the Kuratowski pair, prove that there exists a pure set x 
such that 

x = {(«, x) | n G N}. 


and draw a picture of it. 



SOLUTIONS TO THE EXERCISES IN CHAPTERS 1-12 


2 . 5 . The identity function (x i— > x) shows that A < c A. And if / : A >-> B 
and g : B >— > C, then the composition h(x) = g{f{x)) is an injection of A 
into C, so that A < c C. 

2 . 8 . The assumptions give A < c N and B < c A, so B < c N. 

2 . 9 . If A is empty, then B = /[0] = 0. If A is not empty, then there 
exists a surjection n : N — » A, and then the composition h(i) = f{n{i)) is a 
surjection of N onto B, so B is countable. 

2 . 20 . Suppose / : A >-» B is a bijection, and let n : V{A) — > V{B) be the 
image map. 

n(X)=f[X] (XCA). 

To prove that n is an injection, note that if x e X \ Y, then f(x) G f[X ] 
but it is not possible that f(x) G /[T]; because if f(x) = f{y) for some 
y G Y, then y = x since / is an injection, and so x = y G Y, which is not 
true. Thus 

x £ X \ Y => fix) G f[X\ \ /[T], 
and by the symmetric argument. 

y G Y\X^f{y)£f[Y]\f[X]. 

The two implications together show that 

A / Y=*f[X]?f[Y]. 
so that the image map is an injection. 

To see that the image map is also a surjection, given FC 5. let A = / _1 [F], 
so that, immediately f[X ] C F; but each v G F is the value fix ) of some 
x G A since / is a surjection, and then x = f~ l {y) G X, so that f[X ] = F. 

2 . 23 . Given bijections / : A\ >-» A 2 and g : B\ >-» B 2 , we let for each 
P : -> B|, 

7r(/>) = q where, for each x G A 2 , q(x) = g{p{f~ l {x))). 
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A i 


/ 


A 2 


P 

Bi 


g 


q = 7i {p) 

Bi 


Figure 1. Diagram for Exercise 2.23. 

In short: q is the unique function which makes the diagram in Figure 1 
commutative, i.e., such that 

g(p(x)) = q(f(x)) (. x G Ai). 

We include the details of the argument for completeness, but they can all be 
read from the diagram. 

To check that if p : A\ — > B\, then n{p) = q : A 2 — > B 2 , compute: 

x G A 2 =y f~ l (x) G A\ 

=>^(/ _1 W) e B{ => q(x) = g{p{f~ l {x))) G B 2 . 

The proof that n is a bijection is very similar to that of the corresponding 
Exercise 2.20 for powersets, only a bit simpler. 

First, if pi, p 2 : A< — > B\ and p\ A p 2 , then there is some v G A . such that 

Pi(y) + Pi{y)- If Jc = f{y), then 

P\{f~\x)) = pAy ) ^ p 2 {y) = pAf~\x)) 

using the fact that / -1 is an injection; and so applying g to both sides and 
using the fact that g is an injection, we get that 

n(pi)(x) A n{p 2 ){x), 

so that 7i(p\) A n{pi)- Thus n is an injection. To check that n is also a 
surjection, define for each q : A 2 — > B 2 the function 

p(y) = g~\q(f(y))) (je^i), 

so that p : A\ — > B\, and for each x G A 2 , 

n{p)(x) = g{p(f~\x))) = g{g~ l {q{f{f~ l {x))))) = q(x), 

so that n(p) = q and n is a surjection. 

3.9. We must show that these two sets do not have the same members, and 
this is true because 0 G {0} while 0^0. 

3.13. The empty set is a subset of every set, so 0 G P(0); and if X C 0, 
then X must be empty, since any member of X would have to be a member 
of 0 — and there are none such. For the second claim, it is again immediate 
for every set A that 0,ACA, and so {0, {0}} C P({0}); and for the other 
direction, if X C {0}, then X can have only one member, 0, and so X = 0 or 
X = {0}. 
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3.14. Using the powerset and subset axioms, we set 

B = {X&V{A) | (3teA)[X = {?}]}. 

3.16. The emptyset has no members, and so (J 0 = 0; and the only member 
of the singleton {0} is 0, which has no members, so again, (J {0} = 0. 

3.20. The unary condition P defined by 

P(x) ^==> x € A 

is coextensive with A, and so by (3-10), {x | x € A} = A. Similarly, if 
Q{X) Set {X)&XCA, 

then Q = e V{A), and so by (3-10), {X | Set(A ) &X C A} = V(A). 

3.21. That Set is not a set is part of Theorem 3.11. For W we argue by 
contradiction: if it were a set, so would Set be a set by the Axiom of Subsets, 
since Set = {x e W | Set(x)}. 

3.22. If A = {w | w is a singleton} were a set, then its unionset (J A would 
also be a set; but every object x is a member of its singleton {x}, and so 
(J A = W, which is not a set by the preceding Exercise. 

3.23. This is primarily an exercise in terminology, and we can prove it by a 
simple round-robin argument. We assume that A is a class. 

If A is a set, then A e {A} and the singleton {A} is a set, and hence a class, 
so A belongs to a class. 

If A belongs to some class B, then it must be an object (atom or set), 
since only objects are put into classes by (3-8); and since A is a class by the 
hypothesis, it is not an atom, and so it must be a set. It follows that A C A, 
and so A is a subset of a set. 

Finally, suppose that A is a class and A C X, for some set X. If A is a set, 
we are done; otherwise, by (3-10), A = P for some definite condition P, but 
A C X, so that only members of X can satisfy P and A = {x £ X | P(x ) }: 
and this makes A a set by the Subset Axiom, contradicting the case hypothesis. 

4.3. If Pair(z), then z = (x, y) for some x, y, and by the definitions, 

x = First(z),j> = Second(z); 

and if z = (First(z), Second(z)), then z = (x, y) with these x, y, and so 
Pair(x, y) holds. 

4.4. If (x, y, z) = (x, (y, z)) = (x', (>>', z')) = (x' y y', z'), then by the basic 
property of pairs we have x = x' and (y, z) = (y 1 , z’)\ and so applying the 
same property once more, we have y = y' and z = z'. The other direction is 
trivial. 

4.6. A l±l 0 = { (blue , x) \ x € A}, and ( blue , x) € A l±l B for every x £ A and 
every B. For the second claim, notice that for any A, B, the members of A l±l B 
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are ordered pairs; but if we take A = B = {0}, then (with the Kuratowski 
pair), 0 is not an ordered pair, and so we do not have {0} C {0} i±i {0}. 

4 . 9 . There is nothing much to prove for (2) either, since P a .b is a set by 
the Subset Axiom (III) and it is a subset of A x B by its definition — so it is a 
relation on A, B. 

4 . 11 . The claim is trivial for the first two examples. For the third one, 
the reflexiveness x ~ a /b x and symmetry x ~ a /b y=^y ~a/b x are again 
trivial. For the transitivity, observe first that 

[x ~ a/ b y & y = z] => x ~ A/B z; 

this is because the hypothesis tells us that either x = y, and then with y = z 
we have x = z, so that x ~ a /b or x,y G B , and so with y = z we have 
x, z G B, i.e., x ~a/b z again. Suppose now that x ~ a /b )’ and y ~ a /b z, 
and assume that all three x, y, z are distinct, since each of the assumptions 
x = y, y = z or y = z implies immediately that x ~ A / B z by the observation 
just made; but if x / y and y / z, the first hypothesis x ~ a /b )’ gives 
x G B&y G B, and the second hypothesis y ~ a /b z gives y G B & z G B, so 
that x G B & z G B, which implies x ~ a /b z - 

4 . 13 . If = A is the restriction of the identity relation on A, then each x G A 
is equivalent only to itself and so [ x/= A \ = {x}. For the universal relation 
in which x y for all x, y G A, clearly [x/~ A ] = A , for any x G A. 
Finally, for the more complex we have [x/^a/b] — {-v} if y G A\ B, 

and [x/^a/b] = B if x G B. 

4 . 15 . If / is a function, then there exist sets A. B such that / C Ax B, and 
so 

Domain)/) = {x G A \ (3v)[(x,y) G /]} 

and Domain)/ ) is a set by the Subset Axiom (III). The same argument shows 
that Image(/) is a set (a subset of B ), and the last implication is trivial. 

4 . 17 . The proof is exactly that in Exercise 2 . 20 , so the point is to read 
that argument carefully and see exactly what axioms are needed to justify 
it. Recall that the required equinumerosity between V{A) and V(B) was the 
image function 

7 z(X)=f[X] (XCA), 

so we must check what axioms are needed to construct this function as a set 
of ordered pairs. First, the values are sets by the Subset Axiom (III), since 

f[X\ = {y G B | (3x G X)[(x, y) G /]}; 

but this part of the argument uses an assumed ordered pair operation, and 
so we are implicitly appealing to all the axioms required to construct such an 
operation, i.e., those used in the proof of Lemma 4 . 2 . Looking carefully at that 
argument, it uses the Pair Axiom (II) to define the Kuratowski ordered pair; 
the Extensionality Axiom (I) to prove that this operation satisfies (OP1); and 
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to prove (OP2), first the Pair and Unionset axioms (II) and (V) to construct 
AS B and then the Powerset axiom (IV). In short, to make precise and to 
show that if / : A — > B, then the image f[X ] of any X C A is a set, we need 
all the axioms except for the Axiom of Infinity (VI) . 

To finish the proof we define n as a set of ordered pairs: 

n = {(V, Y) G V(A) x V(B) | (Vj)[j G Y «=► (3x G X)[(x,y) G /]]}. 

This is a set by the Subset Axiom (III) again, once we have the basic properties 
of the ordered pair operation as above. 

4.18. The detailed proofs for this are almost identical to the argument in 
Exercise 4.17 and it is not worth repeating them. We confine ourselves to 
defining the functions required to show these equinumerosities, assuming that 
/ : A >-» A 1 and g : B >— » B' . 

To show that A a B = c A' l±l B', we set 

^ f ( blue , f(x)), if i = blue . 

1 (white , g(x)), otherwise, i.e., if i = white . 

To show that A x B = c A' x B 1 we set 

n((x,y)) = ( f(x),g(y )). 

Finally, to show that (A — > B) = c (A' — > B'), we set for p : A — > B, 
n{p) = {(x,y) G A' x B’ \ y = g(p{f~\x)))}, 
following the proof in Exercise 2.23. 

4.22. Using successively (Cl), the hypothesis and (Cl) again, 

A= c \A\=\B\= C B. 


4.23. Given the two bijections / : A >—» \A\ and g : B >-» \B\, we define 
h : AU B A & B by 


h (x) 


( blue . f(x )), if x £ A, 

(white, g (a)), otherwise (i.e., if x G B \ A). 


If A n B = 0, then the two cases in the definition of h are mutually exclusive, 
and so h is a bijection which witnesses 


AS B = c AS B = c \A\ + \B\. 


4.24. These identities follow immediately from the definitions, the ba- 
sic (Cl) and Exercise 4.18: e.g., 

«i + Kj = c hi l±) k 2 = c h W fa (by Exercise 4.18) = c fa + fa. 

4.26. If k is a cardinal number, then k = \A \ for some set A, and by (Cl), 
k = c A: but then (C2) (which holds of a strong cardinal assignment) implies 
that |« | = \A\, i.e., |k| = n. For the non-trivial direction of the second claim: 
if re = c X, then |re| = |A| by (C2), and so re = X by the first claim. 
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Figure 2. Diagram for Exercise 4.28. 

4.27. By the definition, each member of K l±l (L l±l M) is in exactly one 
of these forms, where k, l, m are uniquely determined members of K. L, M 
respectively: 

x = ( blue , k), or x = ( white , ( blue , /)). or x = ( white , ( white , m)). 

In the same way, each member of (K l±l L) l±l M is in exactly one of the 
following forms, where k, l, m are uniquely determined members of K. L, M 
respectively: 

y = ( blue , (blue , k)), or y = ( blue , (white , /)), or y = ( white , m ). 

The required bijection / : i£l±l(Ll±lM) >— » (/fl±lL)l±lM is defined by matching 
the corresponding elements in the obvious way: 

/ ( blue , k ) = (blue , ( blue , k)), 

/(white, ( blue . /)) = ( blue , (white , /)), 

/ (white , ( white , m)) = (white , m). 

4.28. If L D M = 0, then ( L U M) x K splits into the two disjoint sets Lx K 
and M x K, and so each function / C (Lu M) x K splits into two functions 

f L = f C\ {L x K) : L —> K. f m = f Li [M x K) : M —> K. 

This defines a mapping n : ((L U M) — > K) — > (L — > K) x (M — > K) given 
by 

Af) = (/ l ,/ m ), 

and it is very easy to check that n is, in fact, a bijection. 

To get the cardinal identity (4-17) from this, use A l±) /i = L U M where 
L = { blue! x A and M = I white ! x ji. so that L n M = 0 and A = c L, 
ju = c M, and compute: 

K U+v) =c u M) —> K (def., (Cl) and 4.18) 

= c {L K) x (M — ^ K) 

= c K l -K fi (def., (Cl) and 4.18). 
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5.18. We prove by induction on n. that each function 

fn(s) = n + s 

is an injection. This is trivial at the basis, since fo{s) = s, and for the 
induction step. 

fn+i(s) = (n + 1) + s = (n + s) + 1 = f„(s) + 1, 

and so /„+ i is the composition of /„ and the successor function; both of these 
are injections, and so /„+ 1 is an injection. 

5.25. For the first claim, suppose that there is a k £ [n, n ), so that there 
exist t and s satisfying 

n + t = k. k + s = n, k ^ n: 

it follows that n+t+s = n, which by three applications of Exercise 5.18 implies 
that t = 0 and s = 0; and that gives k = n contradicting the hypothesis. 

The second claim is essentially a restatement of Lemma 5.22. First we 
compute; 

k < Sm ^==> k < Sm&k ^ Sm (def. ), 

4=> (. k < m V k = Sm) &k ^ Sm (by Lemma 5.22), 
k < m 

( k < m &k ^ in) V k = m 
k < m V k = m, 

where for the third equivalence we have used the fact that 

k < m => k ^ Sm ; 

this holds because its negation implies that Sm < m which is absurd (if 
Sm + t = m, then m + St = m. so St = 0 by Exercise 5.18, which is false). It 
follows that for all k and n < m, 

k £ [«, Sm) •<=>■ n < k & k < Sm n < k & (k < m V k = m) 

<=>• (n < k &k < m) V (n < k &k = m) ■<=>■ k £ (\n, m) U {w}j, 

the last inference using the hypothesis n < m. 

6.3. For each x £ P, we have x < M since M is the maximum; and if M' 
is any upper bound of P, then M < M' because M £ P. 

6.4. (1) {u, /} has no upper bound. (2) {a, c} has upper bounds {b, d, e 
and /) but no least upper bound. (3) {b, d } has a least upper bound (e) but 
no maximum. 

6.5. Every member of P is an upper bound of 0, because it satisfies (for- 
mally) the condition for being an upper bound, 

(Vx)[x £ 0=>x < M]: 
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it follows that M is the least upper bound of 0 if and only if it is the least 
element of P. 

6.6. The conditions 

A C A, [X C Y&Y C Z]=>X C Z, [X C F& F C A] => X = Y 

are all trivial. If g* C V(A) is any set of subsets of A, then the unionset (J g 3 is 
an upper bound of g 3 , because le? => X C |J g”; and it is the least upper 
bound, because for any M' , if X C M' for every X £ g 3 , then |J g” C M'. 

6 . 9 . If is is empty, then (A — > - E) = (A — > E) = {0} and there is nothing 
to prove, so assume that e 0 £ E. If f : A — >■ E and X = Domain(/), we set 

y/( x ) = //M’ if /( x )i> 

1 e\). otherwise, 

so that f'\A—>E and /' | X = /; and if / : A — > E and X C A, then 
f | X : A E with Domain(/ \X) = X, directly from the definitions. 

6 . 11 . The empty set satisfies (formally) the condition for being a chain, 

(Vx, y)[(x € 0 & y £ 0) => x < y V y < x], 

6 . 12 . The only chains in a fiat poset P are singletons (including {J_}) and 
doubletons of the form { J_, a}, and every one of these sets has a maximum, 
which is also its supremum. 

A discrete poset with at least two elements a, h. cannot be inductive, because 
it has no least element; and so the only inductive, discrete posets are the 
singletons. 

6 . 13 . If u, v £ {x„ | n £ N}, then there exist n.m such that u = x„ and 
v = x m \ if n < m, then u = x n < x m = v, and, similarly, if m < n, then 
v < u. 

6 . 15 . With the usual ordering, N is a chain and it has no upper bound; and 
the poset of Figure 6.2 does not have a least element. 

6 . 16 . We claim that if A C P = E* U (N — > E) and A is a chain, then 
the union set (JA is a function with domain some subset of N: because if 
(x, y),(x, y') £ (JA, then there exist p,q £ X such that (x,y) £ p and 
(x,y r ) £ q ; and if p C q, we have (x,y), (x,y') £ q so that y = y', while 
if q C p we have (x, y), (x, y') £ p, so that, again, y = y'. Moreover, the 
domain of (J A is either all of N or a finite, initial segment of N, because it is 
closed under <: 

x < x £ DomaindJ A) => for some p £ A, x £ Domain(^) 

==> x £ Domain(^) 

==> x £ DomaindJ A). 

Thus (J A £ P, and it is clear that it is the least upper bound of A. 
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6.17. Suppose S C (A >— >■ B) is a chain of partial injections and let p* = (J S 
be the unionset of S. Since p* is obviously the least upper bound of S under 
C , it is enough to show that p is a partial injection; i.e., we must show that 

(x, y), (. x , y') G p* =>y = y, (a) 

(x, y), (x\ y) G p* => x = x' . (b) 

For (a): from the hypotheses, there are p, p' G S such that (x, y) G p and 
(x, y') G p'\ if p < p' , then (x, y) G p' , and since p' is a partial function 
and it also contains (x,y'), we must have y = y'\ and if p' < p, then the 
analogous symmetric reasoning applies and leads to the same conclusion. 
The argument for (b) is similar. 

6.20. If xo <p x'i <p • • • , then the set X = {x„ « G N} of values of the 
sequence {x„}„ e N and its image n[X] = {7 r(x„) | n G N} under a monotone 
n are both non-empty, countable chains, an observation that was needed in 
order to define limits of monotone sequences in the first place, 

lim„ x n = sup X, lim„ 7r(x„) = sup n[X\. 

it follows that if n is countably continuous, then 

7i(lim„ x n ) = 7i(sup(X)) = sup7r[X] = lim„ n(x„), 

as required for one direction of the Exercise. For the other direction, suppose 

that S = {.so, .y 1 } is a non-empty, countable chain in P, and define the 

function (sequence) x : N — » S by the recursion 

x 0 = s 0 , x„+i = max P (x„, j„+i). 

The definition makes sense, since each x„ is a member of the chain S and 
hence comparable with s n+ [ . By a trivial induction on n. 

x n G S, s n tjp x n , x n ■■ /' x„+i, 

from which we get immediately that 

sup S = sup{x„ | n G N} = lim„ x„; 
and the same holds for the images by any monotone n, 

sup7i[5'] = sup{7r(x„) | « G N} = lim„ 7t(x„). 

Thus, if n respects limits of monotone sequences, we have the required 
sup7r[»S] = lim„ 7t(x„) = 7i(lim„ x n ) = ^[sup^ 1 ]. 

6.24. Directly from the definition, 

n{f){n) = n(f 0 )(n), 

where / 0 is the restriction of / to the finite set {0 n\. For the second 

part, if f{n) =2 n, then 

n(f)(2) = f( 0) + /( 1) + /( 2) = 0 + 2 + 4 = 6. 



258 


Notes on set theory 


6.26. If / : X — > Y is continuous and F C Y is closed, then G = Y\F is 
open, and F = Y \ G, so 

f~\F] = f~ l [Y\ G] = f~ 1 [Y]\f- l [G] = X\f- l [G], 

so f~ l [F ] is closed. The symmetric argument shows that if f~ l [F] is closed 
for every closed F C Y, then f~ l [G] is open for every open G C Y, and so 
/ is continuous. 

6.32. Directly from the definition. 

4/)(«) = n(fo)(n), 
where / 0 = {(«. 0)} if n G A , and if n ^ A, then 

/o = {( n,h(w )) | (n + l,to) e /}. 

This /o is finite, in fact it is defined on exactly one number if n £ A or 
n ^ A &f(n + 1) |, and it is the empty partial function otherwise; and so 
by 6.23, n is a continuous map. 

7.3. Suppose < c is a wellordering of C and / : A >—> C is an injection, and 
let 

X f(x) <c f{y) {x,y G A). 

We prove that the relation < A is a wellordering of A. 

Proof that < A is a linear ordering is trivial, basically by inspection. For 
example, 

x <a y&y < A z => f{x) < c f{y)&f{y) <c /(z) 

=> f(x) <c f(z) (because < c is transitive) 

=> x < A z (by defi), 

so that <a is transitive. 

To check that < A wellorders A, suppose X C A and X ^ 0, and let f\X\ 
be the image of X; now f[X] is a non-empty subset of C, and so it has a least 
element f{m)\ and then m is the <^-least element of X, because for every 
x G X, f (. m ) < c f (jc), and so m < A x, by the definition. 

7.4. If (C, <c) is a well ordered set and / : C — » A, define “an inverse” 
function g : A —> C by 

g(x) = min c {y \ f{y) = x}. 

This is an injection; because if x ^ x' , then 

{y\f(y) = x}n{y\f(y) = x'} = <fi, 

and g(x) is in the first of these two sets while g{x') is in the second. 

We now appeal to Exercise 7.3 using this injection. 

7.6. This is trivial: the restriction </ of < v to / inherits the properties of 
reflexiveness, transitivity and antisymmetry, and every non-empty X C / has 
a least member in / n U, which is its least member in I. 
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7.8. By the definition, seg(O) = {x € U \ x <u 0}, and so seg(0) = 0 since 
0 is the least element. 

Again by the definition, 

S(x) = min{y e U \ x <u y}, 

so that there is no z G U such that x <u z <u S(x); this means that 
y <u S(x) ■<=>■ y <u x V y = x, 
i.e., seg(S(x)) = seg(x) U {x}. 

7.10. If V is the set of all initial segments of U, then 

v= V P U {U}, 

where the “proper” initial segments in V p are those of the form seg ;/ (x ) with 
x € U. Moreover, the mapping 

/(/) = the unique x € U such that / = seg [/ (.\-) 

is obviously order preserving from V p into U (in fact it is a similarity) and so 
V p is well ordered. Now V is obtained by adding a point (U) at the top of V p , 
and so it remains well ordered — for a detailed proof of this, see Exercise 7.18 
below. 

7.12. Recall from Definition 6.18 that a mapping n : P — > Q is monotone if 

x < y =>■ n(x) < n{y), 

and so, it is order-preserving if, in addition, the converse holds, 

n(x) < n{y) => x < y; 

hence, order-preserving mappings are monotone. For a counterexample to 
the converse, take any constant mapping n{t) = c, with P chosen so that it 
has (at least) two members x ^ y; constant mappings are trivially monotone, 
but this one cannot be order preserving, since n(x) = n(y) = c, which would 
imply (if n were order preserving) that x < y and y < x, i.e., x = y. 

7.13. Suppose (towards a contradiction) that P and Q are linear, / : P — > Q 
is order-preserving, x < P y but /(x) ftq f{y)\ it follows by the linearity of 
Q that f{y) <q /(x); but this implies y <p x, since / is order-preserving; 
and this contradicts the hypothesis x <p y. 

For the converse, suppose (with P and Q linear again) that / is strictly 
monotone but not order-preserving. Easily, x <p y=> f(x) <q f(y), 
taking cases on whether x = y or x < P y, and so, since / is not order- 
preserving, there must exist x, y € P such that /(x) <q f(y) but x y; 
but then y <p x by the linearity of P\ and so / (y) <q f (x), since / is strictly 
monotone, contradicting the other assumption, /(x) <q f(y). 

7.14. The identity mapping 7t(x) = x on P is a similarity, so P = 0 P. 
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To verify the symmetry property, suppose n : P >-» Q is a similarity, and 
let : Q > — » P be the inverse bijection. Now, if x,y G Q, then 

7r _1 (.\') <p n~ (y) n(n~ l {xj) <q ^(^'(y)) 

since n is order-preserving, and the order-preserving property for n~ 1 follows 
because n(n~ l (z)) = z for every z € Q. 

For the transitivity property, we first observe that the composition of order- 
preserving mappings is order-preserving : 

x <p y <*=>■ n(x) <o n{y) p{n{x)) < R p{n(y)). 

Since the composition of bijections is a bijection, it follows that the composition 
of similarities is a similarity , and so =„ is transitive. 

7 . 17 . If n : P >-» Q is a similarity of P with Q, then its extension by one 
value p U {(/>, to)} is easily a similarity of Succ(F’) with Succ(£>). 

7 . 18 . It is easy to show that <succ(t/) is a linear order of Field ( t/) U {?c/}. To 
check the wellordering property, suppose 0 f X C Field(C/) U {tu}\ if X = 
{tc/}, then tjj is the least member of X in Succ (U), and if X n Field( U) f 0, 
then the < [/-least member of this set is the <s UC c(t/) -least member of X. 

7 . 28 . Let = pit be the composition of n and p , so that 

a(x) = p{n{x)) ( x G U). 

To prove that a : U — > W is order preserving, we compute, using the fact that 
n and p are order preserving: 

x <u y n(x) <v n(y) 

p(n(x)) < w p(n(x)) 

<=3- a(x) <w o(y)- 

To show that o[U] is an initial segment of W, supposes &a[XJ] = p[n[U]\, 
so that w = p{v) for some v G n[U]. and w' <w w. Since p is an initial 
similarity, there must exist some v' G V such that w' = p{v')\ and since 
p(v') <w p{v) and p is order preserving, we must have v' <y v. Since 
v G 7c[( 7], we have at this point some u G U such that 

v = n(u) and v' <v v. 

But n is also an initial similarity, so that (as above) there must exist some 
u' <u u such that v' = n(u'), from which we get 

ct(m') = p{n{u')) = p( v') = w' , 

which completes the proof. 

8 . 3 . There is no choice set S for a family W which contains 0, since S R 0 
is not singleton. For the second example, if S' is a choice set for W, then 
S fl {a} = {a} and also S n {/r} = {b}, so that {a, b} C S and S n {«. b} is 
not a singleton. 
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8 . 7 . Suppose (Vx)(3j).P(x, y), and let e be a choice function for B, so that 
[X C B&X ± 0]=> £ (X) G X. 

Define / : A — > B by 

fix) = e({y | P{x,y)})\ 

this makes sense since for each x G A, {y \ P(x, y)} ^ 0 by the hypothesis. 
Moreover, /(x) G {y \ P{x,y)} for any x G A, since e is a choice function 
for B, and so P{x, f (x)), as required. 

8 . 15 . Suppose a G A, P C A x A, and (Vx G A)(3y G A)P(x, y). Let 


P*(u,x) 


u G A* &x G A 


& 


[u = 0&x = a] V tlh(u) > 0&i > (w(lh( M ) — l),x)] 


If u = 0, then P*(u,a), and if u = («o , m„) with n > 0 and P(u n ,x), 

then P*(u,x); thus the hypothesis on P guarantees that (Vw)(3 x)P*{u,x), 
and the assumed version of DC supplies a function / : N — > A such that 
(\/n)P* if {n), /(«)); which in particular implies that /(0) = a, when we 
apply it to m = 0. Moreover, if n > 0. 

P*ifin),fin)) <*=>■ /’(/(« - 1), /(«)), 
so that (Vn > 0 )P(/ («— 1), / («)), which is the same as (V«)/ > (/ («), / (n+1)). 

8 . 18 . Suppose first that (P, <) is linear and grounded, and let X C P be 
a non-empty set of points; it follows that there is some m G X such that for 
all x G X, x ft m. which by the linearity of < means that m < x, and so 
m is the least member of X. Conversely, if (P. <) is a wellordering and X is 
any non-empty subset of P, then X has a least element m, which is certainly 
minimal in X. 

9.4. If u is a leaf, then u C w => w = u, so that T u = {w G T \ w C u}\ 
on the other hand, the second summand in (9-2) is empty (because u has no 
children), and so (9-2) holds. If u is not a leaf, then clearly, for every w, 

uQw (3u)[v is a child of u&v C w\, 
which again implies (9-2). 

9 . 5 . An infinite branch / : N — » N would need to satisfy 

/(0) >/(l) >••• , 

and there are no infinite, descending sequences of natural numbers. 

9 . 12 . In full detail, we know that k < c 2 k , and so k + < c 2 K by (9-5). Now 
the GCH asserts that no cardinal is properly between an infinite k and 2 K , so 
that k + = c 2 K : and, conversely, if k + = c 2 k , then there is no cardinal properly 
between k and 2 K , again by (9-5). 
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9 . 19 . If k\ and k 2 satisfy (1) - (3), then K| = c A for some A G S by (2) 
for k \ ; and so k 2 < c A < c k\ by (3) for k 2 . The symmetric argument gives 

Kl < c k 2 . 

9.22. This is immediate, from the theorem and the fact that for every i, 
{/} < c 2, since 2 is a set with two members. 

9 . 24 . This is immediate. 

10 . 2 . Suppose u C v and compute: 

x g M v => v c x u c x =» x e A f u . 

The converse is equally obvious. 

10 . 3 . The family of neighborhoods is equinumerous with N*, and N* = c N. 

10 . 5 . If G is open and x G G, then by (10-6), there is a u such that 
x G M u C G \ and if each x G G belongs to some A f u C G. then (10-6) holds 
trivially. 

10 . 9 . By definition, M u = [T„\. where the subtree T u is defined in (9-1) and 
it is splitting, since for each v G T u and each n, v *(n) G T„. 

10 . 13 . If A is uncountable and A = [J H A n( zu. then at least one A n must be 
uncountable; and if A„ G T, then A„ has a non-empty, perfect subset P by the 
hypothesis, and so also P C A. 

10 . 17 . If n : (N — ^ - N) — > (N - N) is continuous and x G N. then 
7r(x) = sup{7r(u) | v G N*,u C x}. 

and so if n{x) G A f u (which means u C n(x)), there must exist some v Q x 
such that u C n(v)\ but this implies that u C n(y) for any y G Af such that 
v C y, so that finally, 

y G Af v => n{y) G M u 

for this v, which establishes the continuity of / = n \M at x. 

10 . 26 . Call (temporarily) a collection of pointsets good if it is one of the 
collections whose intersection defines B{X), i.e., 

W is good •<=>■ Q C W 

& (V{^4„}„)[(Vu)^4„ G '§ G 

&(\/A G %)[cA G g”]}. 

To verify that B(X) = f) {% \ % is good}, is a cr-field which contains all the 
open sets (i.e., good), we must check the following three claims; 

(1) Every open set G C X is in B(X); this is because G G % for every good 
g\ and so G G B(X). 

(2) If A G B(X), then cA G B(X): if A G S’ for every good f”, then cA G S’ 
for every good S, by the definition of “goodness”, and so cA G B{X). 
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(3) If A„ £ B(X) for every n, then\J n A„ £ B(X). Fix some good collection 
W . The hypothesis tells us that each A„ £ 2r\ and so by the definition of 
“goodness”, \J„A n £ but g* was arbitrary, and so (J n A n belongs to every 
good collection, which means that (J „A n £ B(X). 

10.27. By Problem xl.3, f]„A„ = c({J n cA n ). 

11.4. By the Replacement Axiom, the image F[X] of any set A is a set, so 
there exists a w satisfying the condition 

(Vy)[y £ w <=$■ (3x e X)\y = F(x)]]; 

and by the Extensionality Axiom, only one such w can exist, so we can set 

G(X) = the unique w such that (Vj)[y £ w (3a- £ X)[y = F’(x)]]. 

11.7. 

! a, if n = 0, 

y, if n > 0& (n — 1, y) £ w & (Vt f }’)[{n — 1, t) £ w], 

0. otherwise. 

11.9. In these cases, it is easiest to verify |JM C M by inspection, after 
computing the relevant unionsets: 

U0 = 0; 

U{0,{0}} = {0}; 

U{0,{0}.{0.{0}}} = {0.{0}}; 

UNo = N 0 . 

Finally, if M is a class of atoms, then (J M = 0, so (J M C M . 

11.11. By the definition, A £ TC(A) and TC(A) is transitive, so that 
A C TC(A') and hence A U {A} C TC(A), for any A. For the other direction, 
suppose A is transitive and x £ A U {A}. If x £ A, then x C A C A U {A}, 
and if x = A, then again x C A C A U {A}\ which shows that A U {A} is 
transitive, and hence A U {A} C TC (A). 

11.13. If there are no atoms, then the transitive closure TC(A) of every set 
has no atoms, and so every set is pure. Conversely, if there exists some atom 
a, then TC({«}) = {{«}, a }, and so {a} is not pure. 

11.14. If A is transitive, then TC (A) — A U {A} by Exercise 11.11, and 
A U {A} is finite or countable exactly when A is finite or countable. 

11.16. False: the singleton {«} of an atom is transitive with TC({«}) = 
{{«}, a}, but it is not a subset of its powerset V{{a}) = {0. {«}}, all of whose 
members are (by definition) sets. 

11.17. The inclusion M n (l) C M n {J) is proved by a simple induction on 
n. with the basis M 0 (7) = / C J = M 0 {J) supplied by the hypothesis. 
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11 . 20 . The class W of all objects is transitive, because if y is an object 
and x G y, then x is also an object — simply because we have assumed that 
membership is a condition on pairs of objects; and it is a Zermelo universe, 
because we have assumed of it all the conditions we demand of a Zermelo 
universe — that it contains the unordered pair {x, y} of any two objects (in it), 
and the unionset (J X and powerset V(X) of any set (in it), and that it also 
contains a set / which satisfies the Axiom of Infinity, from which No can be 
constructed using the closure properties of W, as in the proof of Theorem 5 . 4 . 

For the second claim, notice first that No C M, since M is transitive and 
contains No, and so 0 G M, since 0 G No- Moreover, if 4 G M, then 
V{A) G M and so V{A) C M, which means that every subset of A is in M. 

11 . 22 . The Peano system constructed in the proof of Theorem 5.4 is a 
member of every Zermelo universe M , because by Proposition 11 . 21 , M is 
closed under all the operations we used in that proof to construct it. 

11 . 24 . By the Axiom of Choice, the hypothesis of (11-24) implies that there 
exists some / : A — > B such that for all x G A, P(x, f (x)). But the function 
space (A — > B) G M, and so/gM since M is transitive. 

11 . 27 . No descending G-chain can start with x if x has no members, which 
is why atoms and 0 are grounded. For N 0 , we use the fact that it is a Peano 
system with 0 = 0 and Sm = {/«}, and we prove by induction that for all 
m G No, there is no infinite, descending G-chain which starts with nr this 
is clear if m = 0, since 0 = 0, and if it is true of in, then it is also true of 
Sm = { m } . for which the alleged chain would have to start with 

{m} 9 m 9 • • • 

immediately yielding a chain which starts with m. 

If A is grounded, then so is each x G A: because if x 9 xi 9 • • • were an 
infinite, descending G-chain starting with some x G A, then A 9 x 9 x\ 9 • • • 
would be an infinite, descending G-chain starting with A\ and conversely, if 
every member of A is grounded, then we cannot have a chain A 9 x\ 9 • • • , 
because the tail x\ 9 • • • would witness that x\ is not grounded. 

Similarly, if A is grounded, then so is P(A): because any chain V{ A) 9 X 9 
xi 9 • • • would yield a chain xi 9 • • • starting with xi G A. And conversely, if 
V{A) is grounded, then so is A, because any chain V{A) 9 X 9 x 9 xj 9 • • • 
starting with V{ A) would yield a chain x 9 xi 9 • • • starting with some x G A. 

Finally, the class of all grounded sets is transitive, because, again, all ele- 
ments of a grounded set grounded. 

12 . 2 . For any function / : A — > B, the image /[0] of the empty set is 
empty; and so with v v : U — » ord(N), since 0^ has no predecessors, 

fu(0u) = v v [{y G TJ | y < Of/}] = v L 7 [0] = 0. 

By the definition of the successor operation on U , 

y <u S(x) <=>■ y <u x V y = x, 
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and using this, we can compute: 

vt/OSM) = {vu(y) I v <u S(x)} 

= {Vu(j) | V <u X} U {Vf/(.\')} 

= vi/U) U {v £/ (x)}. 

12.3. If 0 u is least in U , then 0 ( / < v x, and so 0 = r C /(0f/) G vj/(x). If 
a G vu(x), then a = vu(y), for some y <u x; but then y has a successor 
S(y) <u x, since x is a limit point, Vf/(5(j)) G vu(x), by the definition, and 
using the preceding Exercise 12.2, 

VtfOSGO) = v u{y) u {vf/OO} = a U {«} G v v (x). 


12.4. Let 


G = {X | 0 G X&{Va G X)[a U {a} G X]}. 

Since a>u is a limit point, vu(cou) G G by Exercise 12.3 — and, in particular, 
G is a non-empty class. For the converse, suppose towards a contradiction 
that there is some IgG such that vjj(cou) % X, and let y be least in U such 
that v v (y) ^ X. Now y is not the least element of U, since v c/ (0 [ /) = 0 G X, 
by the hypothesis on X. And y is not a limit point of U, since y < (»u'. 
hence y = S(x) for some x G U, vu{x) G X, by the choice of y, and 
v u(y) = vu(x) U {vj/U)} G X by the hypothesis on X, which contradicts the 
assumption on y. 

12.6. If n: U >— > V is an initial similarity, then, directly from Lemma 12.5, 
ord(t7) = {vj/(x) | x G U} = {vv{n{x)) | x G 17} C ord(F). 

12.8. If U = Succ((N, < N )) is the next well ordered set to (N, < N ) with t 
added on top, then t = cou, N = seg v (t) and ord(N, <n) = vu(t) = co . The 
rest follows immediately from Exercise 12.2. 

12.10. More precisely, the claim is that ord(a, <„) = a, and the proof is 
as follows: if a = ord(C), then (a, < a ) = 0 U by Lemma 12.9, and so by 
Exercise 12.6 


ord(a) = ord(C/) = a. 


12.13. If U = 0 V, then 

ord({7) = 0 V = 0 V = o ord(F) 

by (12-12), and so ord(ord( U)) = 0 ord(ord( F)) by (12-12) again, from which 
ord({7) = ord(F) follows by Exercise 12.10. 

For the second claim: if U = 0 a and U = 0 fl, then a = 0 and so, in the 
same way, 


a = ord(a) = ord(/?) = 
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12.16. If fi is the least member of W , then fi < a, for every a € S’, so that 
by Lemma 12.9, ^Ca; thus fi C f'|§f. On the other hand, if y £ fi, then 
y < fi, so y < a for every a € S’, i.e., y € a for every a € S, hence y £ f)S; 
thus y € ft => ]'€?, i.e., jSCf)g’. 

12.20. The successor poset Succ(a) is defined by adding a new element 
r £ a to the field a and placing it above all the members of a. Define 
n : a + 1 = a U {a} — > Succ(a) by 


n{x) 


x, if x € a, 

r, otherwise (i.e., if x = a). 


It follows immediately that n is a similarity, so that a + 1 =o Succ(a), and 
hence a. + 1 = ord(a + 1) = ord(Succ(a)). 


12.21. This is an elaboration of the preceding Exercise 12.20. By the 
definition of the sum of two posets in 7.37, we have 

a+g P = ({0} x a U {1} x /?, <), 
where the ordering < is defined by 

U, x ) < (j, y ) / < j V [/ = j & [x = y V x e v]], 

using the properties of the ordering on the ordinals. Fix a, and for each /?, 
define the function 


np : a + 0 P — > ON 
by 

7r(0, x) = x, 7r(l,j) = a + y; 

we show by ordinal induction on /? that up is a similarity of a + 0 ft with a + f, 
so that 


ord(a + 0 fi) = ord(a + fi) = a + fi. 

At the basis fi = 0, a + 0 0 = ({0} x a, <), so that 7i 0 (0, x) = x and this is 
obviously a similarity of a + 0 0 with a + 0 = a. 

At the induction step, if fi = y + 1 , then the required result is exactly the 
preceding Exercise 12.20. 

Finally, at the induction step when fi is a limit ordinal, the induction hy- 
pothesis gives us that for each y < fi, the function n y is a similarity of a + 0 y 
with a + y. It is obvious from the definition of these functions that 

y <3 < fi =$■ 7i y C 7 is', 

and from this we get easily that the union 

71 P = U y<p n y 

is a similarity of a + 0 fi with a + fi, as required. 

The results about associativity and (not)-commutativity follow now from 
the problems in Chapter 7, especially Problem x7.4. 
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Finally, (12-26) also follows from the main part of the exercise, because if 
7i : [3 > — * y is an initial similarity of /? onto a proper initial segment of y, then 
the function 


p((i . *)) 


(z, x), if i = 0. 
(z, 7z(x)), if i = 1 


is easily an initial similarity of a + 0 fi onto a proper, initial segment of a+ 0 y. 
12 . 22 . By the definition, 

a - 0 fl = {a x /?, <), 

where < is the “inverse lexicographic” ordering on the pairs, 

{x\,yi) < ( * 2 ,yi ) y i < yi v [y\ = y 2 &.x\ < x 2 ]. 

To illustrate a different method than the one we used in the preceding exercise, 
we prove for each fixed a by ordinal induction on fl. that 

a -o P =o a. ■ /?, 

which gives the required result by taking the ordinals of both sides. 

At the basis y? = 0, we have that a ■„ 0 is the empty poset, a ■ 0 = 0 = 0, so 
we have literal equality. 

At the successor induction step, f? = y + 1 = y U {yj, so 
a. x P = (a x y) U (a x M)- 

Notice that 

(a x y) fl (a x {}'}) = 0, 

because the pairs (x, y) in the second part have second member y, while every 
pair (x, y) in the first part has second member some y < y. Moreover, 

({(a, y) | x G a}, <) = 0 a 

by the trivial similarity p{x, y ) = x, and so, by the definition of poset addition, 

a - 0 P —o a - 0 y + 0 a; 

but then the induction hypothesis, the preceding Exercise 12.21 and the defi- 
nition of ordinal multiplication imply together that 

a- o p= 0 a-y + a = a- f) 

which is what we needed to show. 

For the limit case in the induction step, the induction hypothesis gives us 
for each y < ft a similarity 


7 i y : a - 0 y >— » a ■ y, 

which is an initial similarity of a - 0 y into a ■ [1 because a ■ y < a ■ fl, so that 
a ■ y is an initial segment of a ■ fi. We can now appeal to that fact that if V. V 
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are well ordered sets and U < 0 V, then there is exactly one initial similarity 
n : U >-► V by Problem x7.13. In the present case, this means that 

S < y < 0 => np\y = n y , 

i.e., these initial similarities “cohere”, so that their union is an initial similarity 
of a ■„ 0 into a ■ 0. 

U{7C,, | y < 0} = n : a ■„ 0 >-> a • 0. 

But this n is onto a ■ 0 since 

a- P = U y< fi a ■ 7 

by the definition of ordinal multiplication, and so n is a similarity and we have 
the desired a •„ y? = a a ■ /]. 

The results about associativity and (not)-commutativity follow now from 
Problems x7.9 and x7.8. 

Finally, if n : /? >-» y is a proper initial similarity, then so is (easily) the 
function p : a ■„ /? >-> a - 0 y defined by 

p(x,y) = (x,n(y)), 

and it establishes the required a ■„ /3 < 0 a ■„ y. 

Note. Another way to do the main part of this exercise is to show directly 
that for all a , /?, the map 

n(x, y) = x ■ y ((x, y) e a x ft) 

is a similarity of a - 0 (1 with a ■ />. The proof (by ordinal recursion on [’>) is 
very similar to the argument we just gave, but it involves a somewhat more 
detailed “chasing” of similarities. 

12.23. We prove the claimed identity by ordinal induction on y, simultane- 
ously for all a and />: i.e., we prove by induction on y that 

(Va, fi)[ot ■ (ft + y) = a ■ [1 + a ■ y]. 

The basis y = 0 is trivial. 

For y = d + 1 , compute: 

a-(jff + G5+l))=a-((j? + <5) + l) 

= a ■ (/? + (5) + a (def. of •) 

= a-fi + a- S + a (ind. hyp.) 

= a • 0 + a • {d + 1 ). 

Finally, for limit y, we observe first that 

U r,<y{0 + 7 } is a limit ordinal. 

because for each >) < y there is a C such that >1 < £ < y, and then 0+tj < 0+£ 
by (12-26). Using this to distinguish cases in the definition of multiplication, 
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we compute: 

a-{p + y) = a- (U ,, <y {P + >1 }) 

= U, ;<} ,{a •(/? + '?)} 

= U,< y {« • P + a ■ i} ( ind - h yp-) 

= a ■ P + a ■ y (def. of +), 

where for the last inference we have used (12-27) to infer that if r/ < £ < y, 
then a ■ rj < a ■ £ < a ■ y, so that a ■ y is a limit ordinal and a-y = (J K a ■ //. 

12.26. Suppose first re = (/i£ € ON)[£ = c A], and assume towards a 
contradiction that k = c a for some a < but then A = c a < n, contradict- 
ing the definition of re. For the converse (and the second claim), if re is not 
equinumerous with any a < re, then, re = |re|, since re = c re, and so re e Card„. 

12.27. If A = c B, then, for all d e ON, 

A = c £, *=* B = c 

and hence (juc G ON)[^4 = c c] = (fi£ € 0N)[5 = c d]. The third property of 
strong cardinal assignments is trivial when we assume the Axiom of Replace- 
ment: because {|A| | X e g 3 } is the image of W under the definite operation 
X i— > \X\, and so it is a set. 

12.30. If AC holds, then every set is equinumerous with an ordinal A, and 
then \A\ is a von Neumann cardinal, by definition. Conversely, if \A\ = K a . 
then A = c H a , and so A is well orderable. 

12.31. Granting AC, the Generalized Continuum Hypothesis is the claim 
that for every infinite cardinal number re, 

l'P(re)! = 2 K = re+; 

which becomes exactly the claimed identity since the cardinals are exactly the 
alephs and N+ = K a+ i. 
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